Re: Collecting look ahead set

megatest!djones@decwrl.dec.com (Dave Jones)
23 Feb 89 21:47:42 GMT

          From comp.compilers

Related articles
Collecting look ahead set craig@comp.lancs.ac.uk (Craig) (1989-02-21)
Re: Collecting look ahead set megatest!djones@decwrl.dec.com (1989-02-23)
Re: Collecting look ahead set eachus@mbunix.mitre.org (1989-03-07)
Re: Collecting look ahead set schwartz@shire.cs.psu.edu (1989-03-09)
Re: Collecting look ahead set megatest!djones@decwrl.dec.com (1989-03-10)
| List of all articles for this month |

From: megatest!djones@decwrl.dec.com (Dave Jones)
Newsgroups: comp.compilers
Summary: The posted lookahead-set generator is incorrect.
Date: 23 Feb 89 21:47:42 GMT
References: <3353@ima.ima.isc.com>
Organization: Megatest Corporation, San Jose, Ca

> The following two programs are usefull in the production of yacc based
> grammars. The first will collect a follow set of acceptable tokens into
> an array (yylkahead)


[...]


I must congratulate Mr. Wylie's intrepidness in undertaking to hack the
formidably obtuse y.tab.c. It's a task which I have attempted on
occasion, always with awestricken wonder at the skill with which the
short little file has been obfuscated.


Having said that, I must point out that the patch cannot work.
Sometimes there will be legal next tokens which don't get
tabulated. I can suggest an algorithm which would work, though.


The problem is caused by default reductions.


Suppose yacc has just discovered a syntax error.


After yacc (unknowingly) uncovered the first illegal token, it may have
made several default reductions, removing states from the stack, and
finally pushing one which could only shift tokens. Only at that point
would it discover that it is blocked. It could then tabulate the
tokens which were shiftable from that state, but it would not then know
about tokens which might have been shiftable in those states which
it removed when doing the erroneous default reductions.


Consider the following:


%%
goal: fizzle 'x' | foozle 'y' ;
fizzle: 'F';
foozle: 'F';
%%


Consider the state after yacc has seen 'F', and suppose for sake
of example that yacc uses "fizzle: F" as the default production.


State n:


      fizzle : 'F' _
      foozle : 'F' _


                  on 'y': reduce 'F' to foozle
default: reduce 'F' to fizzle


In this state, if a 'z' (an error) is next, yacc will erroneously reduce
anyway, and enter a state like this:


      goal: fizzle _ 'x'


                on 'x': shift
                default: error


Here we check for all legal tokens, and find that only 'x' is legal.
But in fact, both 'x' and 'y' would be legal, if we could undo the
erroneous default reduction.




...




So, what to do? How's this: At the start, and every time any action
other than a default reduction or error action gets done, we empty a
set called "yy_default_reduction_states", er, excuse me, "yydfrs".
Every time a default reduction or error-action takes place, we add the
state on top of the stack to the set.


After the syntax error is detected, we go through the set of
states, making a set of all tokens which could have caused
a shift *or* a reduce in any of those states, but didn't. Those are
the ones which would be legal next.




The trick would be to encode the yydfrs set efficiently enough so as
not to slow the parser significantly.


Anybody care to volunteer a go at it?
[From megatest!djones@decwrl.dec.com (Dave Jones)]
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.