Re: Computing "first" in LR(1) parsing

chrisd@reservoir.com (Chris Dodd)
29 Nov 2001 23:06:53 -0500

          From comp.compilers

Related articles
Computing "first" in LR(1) parsing remove.haberg@matematik.su.se (2001-11-26)
Re: Computing "first" in LR(1) parsing kan@cs.tu-berlin.de (Sönke Kannapinn) (2001-11-29)
Re: Computing "first" in LR(1) parsing chrisd@reservoir.com (2001-11-29)
Re: Computing "first" in LR(1) parsing max@gustavus.edu (2001-11-29)
Re: Computing "first" in LR(1) parsing kgw-news@stiscan.com (2001-11-29)
Re: Computing "first" in LR(1) parsing remove.haberg@matematik.su.se (2001-12-03)
| List of all articles for this month |

From: chrisd@reservoir.com (Chris Dodd)
Newsgroups: comp.compilers
Date: 29 Nov 2001 23:06:53 -0500
Organization: Posted via Supernews, http://www.supernews.com
References: 01-11-123
Keywords: parse, LR(1)
Posted-Date: 29 Nov 2001 23:06:53 EST

[posted and mailed]


remove.haberg@matematik.su.se (Hans Aberg) wrote in <01-11-
123@comp.compilers>:
>I want to compute the function "first" of LR(1) parsing: If x is a
>string of grammar symbols, first(x) is the set of terminals that begin
>the strings derivable from x with respect to the given grammar, plus
>the empty string, if derivable as well. In Aho et al, "Compilers", sec
>4.4, p.189, 3, it says that if x is a nonterminal and there is a
>production x -> y1 ... yk, then to first(x) one should add first(yi)
>if the empty string is part of first(y1), ..., first(y{i-1}).
>
>But it does not say what happens if this procedure recurses, that is,
>if say when computing first(y1), one recurses back via some other
>relations to first(x). If that happens, is that not a LR(1) grammar,
>or how should this situation be handled?


You repeat "until no more terminals terminals or epsilon can be added
to any FIRST set". What this means is that you have to repeatedly
re-evaluate all the FIRST sets until none of them change. This is
generally referred to as iterating until a least fixed point is reached.


Another way of thinking about it is that the rules given in the
Dragon book actually give you a set of equations for the FIRST sets,
with FIRST(X) defined as the union of various terminals and FIRST(Y)
for various non-terminals. You then need to solve this set of
simultaneous equations.


You do that by starting with FIRST[0](X) == {} for all X, then
calculate FIRST[i](X) in terms of FIRST[i-1]. You start with i == 1,
and just increment i until you find FIRST[i](X) == FIRST[i-1](X)
for all X, and then you are done.


Of course, this can be improved quite a bit efficiency-wise by choosing
the right order in which to recompute the FIRST[i] sets, and by adding
symbols to the sets rather than recomputing them from scratch. You
can do this because FIRST[i](X) will always be a superset of
FIRST[i-1](X) for all i and all X. This (and the fact that all the
sets are finite) also guarentees that the above will terminate.


Chris Dodd
chrisd@reservoir.com


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.