LALR(1) Lookahead calculation email@example.com (1998-09-18)
Re: LALR(1) Lookahead calculation firstname.lastname@example.org (Chris F Clark) (1998-09-22)
Re: LALR(1) Lookahead calculation email@example.com (1998-09-22)
Re: LALR(1) Lookahead calculation corbett@lupa.Eng.Sun.COM (1998-09-24)
Re: LALR(1) Lookahead calculation firstname.lastname@example.org (1998-09-26)
Re: LALR(1) Lookahead calculation email@example.com (Chris F Clark) (1998-09-26)
From: firstname.lastname@example.org (Dick Grune)
Date: 26 Sep 1998 01:05:15 -0400
Organization: Fac. Wiskunde & Informatica, VU, Amsterdam
email@example.com (Chris Hargrove) writes:
>I've been mucking around with making a C++ parser generation library
>for the hell of it (so I could calculate parser state machines on the
>fly based on string-passed rules rather than stick with pre-generated
>output from an external tool). I've decided to limit it to LALR(1)
>since it seems to allow enough grammar complexity.
>Right now the generator is LR(0), and to that end it works like a
>champ. But I've run into problems adapting it to calculate the
>lookahead tokens required for LALR(1). I've referred to both [Aho]
>and [Holub] and while both have several routines for lookahead
>determination, they all seem to depend on either organizing the
>generator's kernel item closure routine for LR(1) (including the
>calculation of first sets during that operation), or doing many passes
>over the kernel items to determine spontaneous and propagated
>lookaheads. All these algorithms seem to come from the point of view
>of a generator that must output precomputed action/goto tables which
>are entirely deterministic from that point on.
All the deterministic algorithms in parsing are the result of
optimized precomputation of item sets, which makes for nice theory and
is useful when you have a fixed grammar and a lot of text to parse.
But there is no need to do the precomputation and the optimization.
You can just as well construct the item sets on the fly, while
parsing. This is quite simple for LR(0) (you have most of the code
already) and not difficult for LR(1). This way you never construct
any table, just all the sets you need to parse the text you happen to
have. And there are as many of these as there are tokens in your
text, so the whole algorithm is linear. For an example see our book,
downloadable through http://www.cs.vu.nl/~dick/PTAPG.html.
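The on-the-fly approach can be sketched in a few lines. The toy grammar, the item representation, and the little driver below are illustrative assumptions (not code from the book); the point is that `closure` and `goto` are invoked *during* parsing, so only the item sets actually reached by the input are ever built.

```python
# Sketch: derive LR(0) item sets on demand while parsing, instead of
# precomputing an action/goto table. Grammar and names are assumptions.

# Grammar: S' -> S ;  S -> ( S ) | x
GRAMMAR = {
    "S'": [("S",)],
    "S":  [("(", "S", ")"), ("x",)],
}
NONTERMS = set(GRAMMAR)

def closure(items):
    """Expand a set of LR(0) items (lhs, rhs, dot) with predictions."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMS:
            for prod in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], prod, 0)
                if item not in items:
                    items.add(item)
                    work.append(item)
    return frozenset(items)

def goto(items, sym):
    """Compute the successor item set for a shift of `sym`, on demand."""
    moved = {(l, r, d + 1) for l, r, d in items if d < len(r) and r[d] == sym}
    return closure(moved)

def parse(tokens):
    """LR(0) driver: each new state is built with goto() as we go."""
    start = closure({("S'", ("S",), 0)})
    states = [start]
    tokens = list(tokens) + ["$"]
    i = 0
    while True:
        # A completed item means reduce; otherwise shift the next token.
        done = [(l, r) for l, r, d in states[-1] if d == len(r)]
        if done:
            lhs, rhs = done[0]          # grammar is LR(0): no conflicts
            if lhs == "S'":
                return i == len(tokens) - 1   # accept only at end of input
            del states[-len(rhs):]
            states.append(goto(states[-1], lhs))
        else:
            nxt = goto(states[-1], tokens[i])
            if not nxt:
                return False            # syntax error
            states.append(nxt)
            i += 1
```

Each shift builds at most one new item set, which is where the linear bound quoted above comes from (for a deterministic grammar).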
You cannot do this easily for LALR(1) since that results from a
transformation on the finished LR(1) automaton. Then again, now that
I come to think of it, perhaps you can, but it seems to me that just
doing LR(1) would be easier and more effective. Also, once you have
the LR(1) algorithm in place, making it LR(2) should be a breeze.
Given the range of your applications that might be useful too.
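The transformation referred to above can be made concrete: build the canonical LR(1) item sets and then merge every pair of states that share the same LR(0) core. The grammar below (S -> C C, C -> c C | d) and the hand-computed FIRST sets are assumptions chosen to keep the sketch tiny:

```python
# Sketch: canonical LR(1) item sets, then the core-merging step that
# yields LALR(1). Grammar and FIRST sets are illustrative assumptions.

GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}
NONTERMS = set(GRAMMAR)
FIRST = {"S": {"c", "d"}, "C": {"c", "d"}}   # hand-computed; no nullables

def closure(items):
    """Expand LR(1) items (lhs, rhs, dot, lookahead) with predictions."""
    items = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in NONTERMS:
            beta = rhs[dot + 1:]
            if not beta:                       # lookahead propagates
                las = {la}
            elif beta[0] in NONTERMS:          # FIRST of what follows
                las = FIRST[beta[0]]
            else:
                las = {beta[0]}
            for prod in GRAMMAR[rhs[dot]]:
                for a in las:
                    item = (rhs[dot], prod, 0, a)
                    if item not in items:
                        items.add(item)
                        work.append(item)
    return frozenset(items)

def goto(items, sym):
    return closure({(l, r, d + 1, a) for l, r, d, a in items
                    if d < len(r) and r[d] == sym})

def all_states(start):
    """Collect every reachable canonical LR(1) state."""
    states, work = {start}, [start]
    while work:
        st = work.pop()
        for sym in {r[d] for _, r, d, _ in st if d < len(r)}:
            nxt = goto(st, sym)
            if nxt and nxt not in states:
                states.add(nxt)
                work.append(nxt)
    return states

lr1 = all_states(closure({("S'", ("S",), 0, "$")}))
# LALR(1) states are LR(1) states grouped by their LR(0) core:
cores = {frozenset((l, r, d) for l, r, d, _ in st) for st in lr1}
```

Merging by core is also exactly why LALR(1) is weaker than LR(1): the lookahead sets of merged states are unioned, which can introduce reduce-reduce conflicts the canonical automaton does not have.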
Dick Grune | email: firstname.lastname@example.org
Vrije Universiteit | ftp://ftp.cs.vu.nl/pub/dick
de Boelelaan 1081 | http://www.cs.vu.nl/~dick
1081 HV Amsterdam, the Netherlands | tel: +31 20 444 7744