Re: Intricate problem with scannerless LALR(1) parser

"Paul B Mann" <paul@paulbmann.com>
Wed, 25 Jun 2008 16:25:51 -0600

          From comp.compilers


From: "Paul B Mann" <paul@paulbmann.com>
Newsgroups: comp.compilers
Date: Wed, 25 Jun 2008 16:25:51 -0600
Organization: Compilers Central
References: 08-06-010
Keywords: parse
Posted-Date: 25 Jun 2008 17:22:42 EDT

> However, that keyword-feature has one side effect which I would
> discuss on the mailing list. Given the following grammar:
>
> start: a
> a: b 'XX'
> b: c | '[' b ']'
> c: 'X' | c 'X'
>
> [Your grammar is ambiguous. To see where, replace "XX" with xx and
> define it like this: xx: "X" "X"
>
> Assuming you want to use normal tokenizing rules, your "X" token is
> really "X followed by something other than a letter or digit, and if
> the something is white space, skip over the white space. Oh, and skip
> comments, too." Now you know why we use separate lexer and parser
> generators, because they need separate state machines to keep the
> parser grammar from exploding. -John]


I agree with John.


XX is valid but ambiguous: it is either the keyword 'XX' or two X's.


A conventional compiler front end would first determine that XX is a
keyword by doing a keyword lookup, and only then pass the token (XX)
to the parser.


This requires a separate lexer that builds each token and looks it up
before the parser sees it.
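
A minimal sketch of that keyword lookup, in Python. Everything named
here is hypothetical and chosen for illustration only: the KEYWORDS
table, the WORD and KW_XX token kinds, and the tokenize function. It
assumes tokens are runs of letters or a single bracket, separated by
optional white space, and it is not how any particular generator does it.

```python
import re

# Hypothetical keyword table: the only reserved word in the example grammar.
KEYWORDS = {"XX": "KW_XX"}

# A token is a run of letters or a single bracket, after optional white space.
TOKEN_RE = re.compile(r"\s*(?:(?P<word>[A-Za-z]+)|(?P<punct>[\[\]]))")


def tokenize(text):
    """Return a list of (kind, text) pairs for the parser to consume."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        word = m.group("word")
        if word is not None:
            # Keyword lookup happens here, before the parser sees the token.
            tokens.append((KEYWORDS.get(word, "WORD"), word))
        else:
            punct = m.group("punct")
            tokens.append((punct, punct))
        pos = m.end()
    return tokens


print(tokenize("[ X X ] XX"))
```

Because the lexer takes the longest run of letters, "XX" always reaches
the parser as one KW_XX token and never as two X's, so the parser has
no choice left to make. The price is that "XXXX" becomes a single WORD
here, which is the "X followed by something other than a letter or
digit" rule John describes.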


This is a good example of one of the disadvantages of scannerless
parsing systems.


Languages like PL/I, which have no reserved words, would probably be
a nightmare for scannerless parsing systems.




Paul B Mann
http://lrgen.com

