Re: Grammar LR(1) but not LALR(1)

ANDREI Stefan <stefan@infoiasi.ro>
8 Nov 2003 01:35:25 -0500

From comp.compilers

Related articles
Grammar LR(1) but not LALR(1) jm@bourguet.org (Jean-Marc Bourguet) (2003-10-31)
Re: Grammar LR(1) but not LALR(1) oliver@zeigermann.de (Oliver Zeigermann) (2003-11-02)
Re: Grammar LR(1) but not LALR(1) cfc@shell01.TheWorld.com (Chris F Clark) (2003-11-02)
*Re: Grammar LR(1) but not LALR(1) stefan@infoiasi.ro (ANDREI Stefan)* (2003-11-08)**
Re: Grammar LR(1) but not LALR(1) jm@bourguet.org (Jean-Marc Bourguet) (2003-11-08)
Re: Grammar LR(1) but not LALR(1) RLake@oxfam.org.uk (2003-11-11)

| List of all articles for this month |

From:	ANDREI Stefan <stefan@infoiasi.ro>
Newsgroups:	comp.compilers
Date:	8 Nov 2003 01:35:25 -0500
Organization:	Compilers Central
Keywords:	LALR, parse, question
Posted-Date:	08 Nov 2003 01:35:25 EST

> It is well known that there are grammars which are LR(1) but not
> LALR(1). For exemple:
> S -> aEa | bEb | aFb | bFa
> E -> e
> F -> e
> It is possible to transform this grammar to make it LALR(1) by
> duplicating productions. For exemple:
> S -> a E a | b E' a | a F b | b F' a
> E -> e
> E' -> e
> F -> e
> F' -> e
> As a matter of fact, only one duplication is needed to make the
> grammar LALR(1).
> I'm convinced that such a transformation (contructing an LALR(1)
> grammar by duplicating productions) is possible for every LR(1)
> grammars, such duplication preventing the merge of the states LR(1)
> automaton.
> A collegue is not convinced and try to come up -- without success
> until now -- with a language which would have a LR(1) grammar but no
> LALR(1) one.
> Could someone point me to a counter example for my claim or a proof of
> its correctness?

    Dear Jean-Marc,

Your question is interesting. I have a good news and a bad news. The
good news is that your construction is correct! This is due to the
fact that this "non-terminal duplication" eliminates the reduce
conflicts in the new CFG grammar since the kernel of the items which
previously were generated such conflicts are no longer in the prefix
automaton. In other words, the states which previously merged together
(and generate the reduce conflict in the LALR(1) automaton) cannot be
merged after the "non-terminal duplication" since they have different
kernels!

    Now the bad news. The LALR(1) grammars were invented in 1969 in an
effort to do practical compilation since the prefix automaton for a
LR(1) CFG may be quite huge! Hence many today compiler-compilers are
using LALR(1) grammars instead of LR(1). That is, the LALR(1) prefix
automaton has the same number of states like LR(0) prefix automaton,
and almost the same power as LR(1) grammars, too. But the
"non-terminal duplication" will not have this advantage of reducing
the size of the prefix automaton (this will the same as the initial
LR(1) automaton). A more even worst news is that you need a bigger
GoTo table to specify the reductions for the new generated
non-terminal symbols. For example, in your proposed CFG (by the way,
there is a typo: S -> b E' a should be S -> b E' b) the initial LR(1)
grammar generates a prefix automaton with 11 states. By merging the
states {(E->e.,{a}), (F->e.,{b})} and {(E->e.,{b}), (F->e.,{a})} into
{(E->e.,{a,b}), (F->e.,{a,b})}, the LALR(1) prefix automaton has only
10 states, but it generates a reduce-reduce conflict, so the CFG
grammar is not LALR(1). As we mention earlier, the "non-terminal
duplication" technique transforms the old non-LALR(1) CFG into an
equivalent LALR(1) grammar, but the prefix automaton will have 11
states and two (although one suffices) new non- variables. So, in
general, I am afraid to tell you that there is no practical advantage
of such technique because: 1. the prefix automaton of the new LALR(1)
grammar will have the same number of states as the initial LR(1)
automaton 2. the new GoTo table will be bigger since it has to store
the new non-variable symbols 3. it is known that LR(1) parsing is
faster than LALR(1) parsing. Here I mean that in case of
non-acceptance of the input word, the LR(1) parser may detect this
with no reductions, whereas the LALR(1) parser can do many reductions
until it gets the same answer!

    Here are some useful references:
[DR69] DeRemer, F.L.: Practical translators for LR(k) languages. Ph.D.
Thesis, MIT, Cambridge, 1969

[DRP82] DeRemer, F.L., Pennello, T.: Efficient Computation of LALR(1)
Look-Ahead Set. TOPLAS, vol. 4, no. 4, october 1982

    Let me conclude by thanking for the question. If you want to know
more about these things (proofs of 1, 2, 3) contact me directly or via
compilers@iecc.com

    Best wishes,
      Stefan

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Grammar LR(1) but not LALR(1)

ANDREI Stefan <stefan@infoiasi.ro>8 Nov 2003 01:35:25 -0500

ANDREI Stefan <stefan@infoiasi.ro>
8 Nov 2003 01:35:25 -0500