14 Nov 1996 21:52:12 -0500

Related articles |
---|

Beginner help with LALR(1) closure kentr@rollinssoft.com (Kent Rollins) (1996-11-12) |

Re: Beginner help with LALR(1) closure dlester@cs.man.ac.uk (1996-11-14) |

Re: Beginner help with LALR(1) closure grout@sp55.csrd.uiuc.edu (1996-11-14) |

Re: Beginner help with LALR(1) closure compres@world.std.com (1996-11-14) |

Re: Beginner help with LALR(1) closure farzu@cs.tamu.edu (Francisco Arzu) (1996-11-14) |

Re: Beginner help with LALR(1) closure adrian@dcs.rhbnc.ac.uk (1996-11-19) |

Re: Beginner help with LALR(1) closure salomon@silver.cs.umanitoba.ca (1996-11-19) |

Re: Beginner help with LALR(1) closure gianni@engr.sgi.com (Gianni Mariani) (1996-12-03) |

Re: Beginner help with LALR(1) closure icedancer@ibm.net (1996-12-07) |

[1 later articles] |

From: | grout@sp55.csrd.uiuc.edu (John R. Grout) |

Newsgroups: | comp.compilers |

Date: | 14 Nov 1996 21:52:12 -0500 |

Organization: | Center for Supercomputing R and D, UIUC |

References: | 96-11-080 |

Keywords: | LALR |

In-reply-to: | Kent Rollins's message of 12 Nov 1996 22:04:02 -0500 |

Kent Rollins <kentr@rollinssoft.com> writes:

*> I'm working on writing my own LALR(1) parser table generator and*

*> I'm a little stuck. I'm doing this because I want to understand*

*> the parsing process. Any tips would be greatly appreciated.*

*>*

*> I'm using 'Compiler Design in C' [Holub]. I'm currently working*

*> on performing LALR(1) closure on LALR(1) states. I've got 2*

*> questions:*

*>*

*> 1. I've downloaded a few YACC-able grammars and I've noticed that*

*> they all have left-recusive grammars like (A) below.*

Because, in an LR grammar, they are more efficient (the parsing stack

is shorter).

*> When I run this thru my generator, these cause infinite recursion.*

Sounds like a bug...

*> I can't find anything in Holub about how the LALR(1) generator should*

*> handle left-recursion. In order to get around the problem, I used a*

*> technique from a previous chapter which eliminates left-recursion (B)*

*> from LL grammars. I would like to know how the LALR(1) closure process*

*> avoids left recursion.*

When in doubt, go back to the source... a classic paper on computing

LALR(1) lookahead sets is by Frank DeRemer and Thomas Pennello,

"Efficient Computation of LALR(1) Look-Ahead Sets, ACM Transactions on

Programming Languages and Systems, Volume 4, No 4 (October 1982),

pp. 615-649... they were among the first to give a constructive

definition which described _how_ to build LALR(1) lookahead sets

efficiently... before their work (and some similar work done at the

same time), algorithms either used LALR subsets or (as did YACC-like

critters) performed potentially _very_ time-consuming and bug prone

iteration to a fixed point (program the iteration slightly

incorrectly, and you could easily loop forever).

My bachelor's thesis parser generator (back in 1981... sigh) was

orders of magnitude faster than the YACC-like critters of the day on

complicated language grammars because it used DeRemer and Pennello's

algorithm (as described in conference papers... the "dead tree"

version cited above was printed after I graduated).

DeRemer and Pennello's algorithm computes LALR(1) lookahead sets in

three simple steps which break down the ways terminal symbols can be

read after a reduction is performed:

1. Immediately (those read in the first parser state visited after the

reduction).

2. Immediately after one or more trivial reductions (to epsilon) with no

intervening reads are performed.

3. Immediately after one or more general reductions (either to epsilon or

not) with no intervening reads are performed.

The "direct read" sets associated with the first step are formed by

inspection of the LR(0) parser, while the sets associated with the

second (the reads sets) and third (the final lookahead sets) steps are

performed using a algorithm by Tarjan (as modified by Eve and

Kurki-Suonio) to compute the transitive closure of a relation (and

associated set unions) which _can't_ get into an infinite loop, no

matter how tangled the underlying relation... the algorithm simply

detects a non-trivial strongly connected component in the underlying

relation and unions together the associated sets (note that there

_can_ be cycles in the relation underlying the third step for

unambiguous grammars... but, as the authors demonstrate, a cycle in

the relation underlying the second step means the grammar is not LR(k)

for any k).

--

John R. Grout Center for Supercomputing R & D j-grout@uiuc.edu

Coordinated Science Laboratory University of Illinois at Urbana-Champaign

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.