Re: Preview in grammar
12 May 2006 00:48:49 -0400

          From comp.compilers

Related articles
[6 earlier articles]
Re: Preview in grammar (Ulrich Frank) (2006-05-09)
Re: Preview in grammar (Chris F Clark) (2006-05-11)
Re: Preview in grammar (Ulrich Frank) (2006-05-11)
Re: Preview in grammar (Ulrich Frank) (2006-05-11)
Re: Preview in grammar (Detlef Meyer-Eltz) (2006-05-12)
Re: Preview in grammar (Chris F Clark) (2006-05-12)
Re: Preview in grammar (2006-05-12)
Re: Preview in grammar (2006-05-12)
Re: Preview in grammar (2006-05-15)
Re: Preview in grammar (Chris F Clark) (2006-05-15)
| List of all articles for this month |

Newsgroups: comp.compilers
Date: 12 May 2006 00:48:49 -0400
References: 06-05-01106-05-032
Keywords: parse, LALR
Posted-Date: 12 May 2006 00:48:49 EDT

Ulrich Frank wrote:
> LALR-Parser merge equal states and therefore the comp-state of 1) and
> the comp-state of 2) get merged (to state 44). My idea was to look at
> the possible input tokens in this state. But because of this merging I
> cannot differ between comp-state of 1) and comp-state of 2). If the
> input satisfies rule 1), e.g. "TODAY <=" then the lookahead token list
> includes ATTR, TODAY, NOW, LETTER and unfortunately INTVAL. But INTVAL
> is not appropriate at this position of the query input.
> How can this be solved with an additional stack?

An overview of the situation is this:

LALR tables don't have enough states to allow a STATIC check of which
terminal symbols are legal as the NEXT token.

Canonical LR tables (not minimal LR tables) have enough states to
allow a STATIC check.

LALR tables require a DYNAMIC check to determine the NEXT tokens
that are legal.

LALR tables cannot be modified to permit a DYNAMIC check simply
because there are not enough states (i.e. some states have been
merged without regard for this concept).

If one tried to do this with LALR tables, conflicting sets of NEXT
would be encountered and states would have to be duplicated (split as
some authors say) and you would end up with canonical LR tables.

So why not just build the canonical LR tables in the first place
of doing it in a backward way.

The LALR dynamic check algorithm that I used supplied to the
parser a known <error> symbol that does not exist in the grammar,
but had been accounted for in the creation of the parser tables.
That's the way my current product works also. It is called LRgen.

The parser has a variable called current state. When the <error>
symbol is encountered, one or more reductions may take place,
eventually causing a nonterminal transition to a state where there
are no default reductions allowed. I call this is an ERROR state.

At this ERROR state, the reductions are undone (they were put on
a stack as they occurred) and the current state number is restored
from the saved "current state" variable which always gives the
state before reductions start occurring.

After the original situation is restored, before the <error> was seen,
you try to parse with all terminal symbols of the grammar. For
each one you have to try a parse and then restore the parser situation
back to the original before-<error> state.

You make a list of all the terminal symbols that allow the parser
to continue parsing (request another terminal symbol).

You may have to keep more than one stack: A stack for the
reductions and a stack for the state where the reduction takes place.
You will have to use these stacks to rebuild the parser situation
back to the original place where the <error> occurred.

I don't have the code in front of me at this time, so I may have left
out a few details. At least, I left out the null-production problems,
which complicate the issue.

I don't supply this code in the FREE version of LRgen because it
tends to confuse the parser algorithm for those who are trying to
learn about LALR parsers.

I can supply this code in the professional version if someone needs

Any other questions ... just ask.

Paul Mann

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.