Related articles 

TopDown Parser Construction Conjectures bart@cs.uoregon.edu (19930118) 
Re: TopDown Parser Construction Conjectures W.Purvis@daresbury.ac.uk (19930118) 
Recursive descent parsing is better than LL(1) jan@klikspaan.si.hhs.nl (19930122) 
Re: Recursive descent parsing is better than LL(1) tchannon@black.demon.co.uk (19930123) 
Re: Recursive descent parsing is better than LL(1) pjj@cs.man.ac.uk (Pete Jinks) (19930127) 
Newsgroups:  comp.compilers 
From:  jan@klikspaan.si.hhs.nl 
Organization:  Compilers Central 
Date:  Fri, 22 Jan 1993 16:41:50 GMT 
Keywords:  parse, LL(1), theory 
References:  9301122 9301123 
Barton Christopher Massey < bart@cs.uoregon.edu > writes:
> I will personally choose to use YACC or something similar for parsing until
> I can implement an LL(1) parser generator which will take a reasonably
> adhoc grammar and either transform it to an LL(1) grammar or report
> failure to do so, "most of the time".
> In order to construct such a tool, I would like to prove or disprove the
> following conjectures:
> LL(1) Transformability: There exists an algorithm
> which, given any grammar G which is unambiguous and
> describes an LL(1) language L(G), produces an LL(1)
> grammar G' for the language L(G).
> LL(1) Decidability: There exists an algorithm which,
> given any grammar G which is unambiguous and describes
> a language L(G), decides whether L(G) is an LL(1)
> language.
> After some thought, it still seems most likely to me that LL1T is true,
> but I'm skeptical about LL1D . I posted these conjectures to
> comp.compilers previously, but didn't get too far with them then. In a
> way, I'm surprised if such apparently important questions haven't been
> carefully studied, but I haven't found any very useful references yet.
Bill Purvis <W.Purvis@daresbury.ac.uk> writes:
> In Computer Journal, Vol 11, number 1, May 1968, J. Foster describes SID,
> a syntax improving program which accepts a fairly arbitrary BNF
> description and applies the neccesary transformations to produce an LL(1)
> grammar (assuming that the language is LL(1). I think that that fully
> covers the first conjecture.
> The program is pretty comprehensive and while it might not cover all
> cases, I suspect that in practical terms it probably covers the second
> conjecture.
This discussion is a bit blurred by the use of the clauses `given any
grammar G' and `reasonably adhoc grammar'. I agree with Bill Purvis that
SID does a good job on some fuzzy set of `reasonably good adhoc
grammars'.
The general case `given any unambigous grammar G' cannot be solved by SID
if it only uses the transformations substitution, elimination of left
recursion and factorization.
I present a simple contraexample:
Consider the language defined by the next CFG:
S > A
S > B
A > `x' A `y'
A > `a'
B > `x' B `z'
B > `b'
SID uses elimination of left recursion, factorisation and substitution to
search for a better grammar.
The problem here is that A and B have `x' in common in their FIRSTsets.
So let's substitute A and B in the Srules giving:
S > `x' A `y'
S > `a'
S > `x' B `z'
S > `b'
A > `x' A `y'
A > `a'
B > `x' B `z'
B > `b'
Two Srules now start both with `x' so we have to factorize them giving:
S > `x' S1
S > `a'
S > `b'
A > `x' A `y'
A > `a'
B > `x' B `z'
B > `b'
S1 > A `y'
S1 > B `z'
Now after solving the problem with the Srules, we get the same problem
with the S1rules.
We can repeat both substitution and factiorisation on S1 giving:
S > `x' S1
S > `a'
S > `b'
A > `x' A `y'
A > `a'
B > `x' B `z'
B > `b'
S1 > `x' S2
S1 > `a'
S1 > `b'
S2 > A `y' `y'
S2 > B `z' `z'
Giving rise to the same kind of conflict, but now on S2rules. Obviously
the proces of substitution and factorization can be repeated
indefinitedly.
The loop suggests the conjecture of decidability doesn't hold.
Foster expresses this problem with `SID may also report failure because
its attempt to transform the grammar causes it to loop.'
But let's return to the initial discussion: recursive descent versus YACC.
Maybe we underestimate the power of recursive descent technique if only we
could get rid of the idea each nonterminal should be mapped to one
function recognizing the right sides of its productions.
Let's try a different approach: Each function recognizes right sides of
one or more nonterminals, returning as a result the kind(s) of
nonterminals recognized.
Let's return to the example grammar above:
S > A
S > B
A > `x' A `y'
A > `a'
B > `x' B `z'
B > `b'
As we cannot discern A from B let's integrate them into one function AorB.
This should answer the question `What did you see' with : neither (call
this an error), A, B, or Both.
AorB might be modelled with a statediagram like:
++
+A>+ `y' +y> result is A
 ++
++ +++ =+
>+ `x' +x>+ AorB +B+ `z' +z> result is B
++ ++++ ++
  ++x> result is A
V +Both+ `y' or z' +
neither ++y> result is B

conflicting inputs +> result is neither
As AorB has no Both exit, this branch could be pruned from the diagram.
In C this would look like:
#define neither 0
#define Both 1
#define A 2
#define B 3
char lookahead;
int match(char c)
{
if c == lookahead)
{
lookahead = getchar();
return 1;
}
else
return 0;
}
int AorB()
{
int r;
if (match('x'))
{
switch (AorB())
{
case neither:
return neither;
case A:
if (match('y'))
return A;
else
return neither;
case B:
if (match('z'))
return B;
else
return neither;
case Both: /* omitted */
}
}
else
return neither;
}
Note: This approach is topdown and recursive descent, but not predictive
parsing! I suppose others have figured this out before. I have no label
to stick upon it. Maybe someone else on the net.
Regards,
Jan Schramp

Jan Schramp, Haagse Hogeschool  Sector Informatica. Louis Couperusplein 19,
2514 HP Den Haag, Netherlands. Email: jan@si.hhs.nl

Return to the
comp.compilers page.
Search the
comp.compilers archives again.