29 Nov 2001 23:06:53 -0500

Related articles |
---|

Computing "first" in LR(1) parsing remove.haberg@matematik.su.se (2001-11-26) |

Re: Computing "first" in LR(1) parsing kan@cs.tu-berlin.de (Sönke Kannapinn) (2001-11-29) |

Re: Computing "first" in LR(1) parsing chrisd@reservoir.com (2001-11-29) |

Re: Computing "first" in LR(1) parsing max@gustavus.edu (2001-11-29) |

Re: Computing "first" in LR(1) parsing kgw-news@stiscan.com (2001-11-29) |

Re: Computing "first" in LR(1) parsing remove.haberg@matematik.su.se (2001-12-03) |

From: | chrisd@reservoir.com (Chris Dodd) |

Newsgroups: | comp.compilers |

Date: | 29 Nov 2001 23:06:53 -0500 |

Organization: | Posted via Supernews, http://www.supernews.com |

References: | 01-11-123 |

Keywords: | parse, LR(1) |

Posted-Date: | 29 Nov 2001 23:06:53 EST |

[posted and mailed]

remove.haberg@matematik.su.se (Hans Aberg) wrote in <01-11-

123@comp.compilers>:

*>I want to compute the function "first" of LR(1) parsing: If x is a*

*>string of grammar symbols, first(x) is the set of terminals that begin*

*>the strings derivable from x with respect to the given grammar, plus*

*>the empty string, if derivable as well. In Aho et al, "Compilers", sec*

*>4.4, p.189, 3, it says that if x is a nonterminal and there is a*

*>production x -> y1 ... yk, then to first(x) one should add first(yi)*

*>if the empty string is part of first(y1), ..., first(y{i-1}).*

*>*

*>But it does not say what happens if this procedure recurses, that is,*

*>if say when computing first(y1), one recurses back via some other*

*>relations to first(x). If that happens, is that not a LR(1) grammar,*

*>or how should this situation be handled?*

You repeat "until no more terminals terminals or epsilon can be added

to any FIRST set". What this means is that you have to repeatedly

re-evaluate all the FIRST sets until none of them change. This is

generally referred to as iterating until a least fixed point is reached.

Another way of thinking about it is that the rules given in the

Dragon book actually give you a set of equations for the FIRST sets,

with FIRST(X) defined as the union of various terminals and FIRST(Y)

for various non-terminals. You then need to solve this set of

simultaneous equations.

You do that by starting with FIRST[0](X) == {} for all X, then

calculate FIRST[i](X) in terms of FIRST[i-1]. You start with i == 1,

and just increment i until you find FIRST[i](X) == FIRST[i-1](X)

for all X, and then you are done.

Of course, this can be improved quite a bit efficiency-wise by choosing

the right order in which to recompute the FIRST[i] sets, and by adding

symbols to the sets rather than recomputing them from scratch. You

can do this because FIRST[i](X) will always be a superset of

FIRST[i-1](X) for all i and all X. This (and the fact that all the

sets are finite) also guarentees that the above will terminate.

Chris Dodd

chrisd@reservoir.com

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.