|Fixing Berkeley YACC ( Re: PD Yacc slows down ) djones@megatest (1989-11-15)|
|From:||djones@megatest (Dave Jones)|
|Date:||15 Nov 89 23:33:42 GMT|
|Organization:||Megatest Corporation, San Jose, Ca|
This article is about fixing yacc. But first a small digression.
>From article <1989Nov15.firstname.lastname@example.org>, by corbett@ernie.Berkeley.EDU (Robert Corbett):
> I have nearly completed the next version of Berkeley Yacc. Some of the
> changes were to make error recovery more effective (though Yacc error
> recovery is never very effective).
I think it can be very effective indeed. If you ask somebody why
they prefer LL parsing, they will say, "Better error recovery."
If you ask somebody else why they prefer LR parsing, they will say,
"Better error-recovery." So far as I can tell, it's a religious
But that's not what this response is about.
(Still, I wonder how much of the bad press LR parsing has received on the
matter of error-recovery is directly the result of the YACC default
> If a state contains a reduction where the lookahead
> is the error token, no default reduction is used in that state.
> For one of my test grammars, this change has doubled the table size
> and the time required to generate the tables. The bulk of the time
> (45%) was spent in fprintf.
No surprise there.
> ... I am beginning doubt if the resulting improvement is worth the cost.
Don't doubt it.
The change should not be considered an "improvement". It is a bug-fix.
(I am familiar with the code. It's a typo, substituting a "1" for a "2".)
It would be quite wrong to leave it as it is. The reason the tables
are smaller without the bug-fix is that they do not do what the
programmer coded for. If the programmer prefers smaller tables to
error-recovery, he can achieve that result by simply omiting
the error-reductions from the source code.
> I have devised a way of achieving the same effect without increasing the
> size of the tables. However, my alternative will make parsing slower.
I would be quite interested to hear the method outlined. I betcha it is
similar to the schemes discussed in this group for reconstructing
follow-sets at runtime for error-reporting. If I were thinking of
putting something like that into YACC's official distribution, I
certainly would not fail to ask this group to review the algorithm
first. To quote the great philosopher John Belushi, "It don't cost
I would not worry too much about table-size. Recently I wrote a
yacc lookalike which produces what I shall call ALR(1) tables (for
"Almost" LR). The only LR(1) sets which it mods together are those
which have no shift-actions and at most one reduce-action. (Puzzle:
why only those? ) The parser creates state-sets which are about twice
as large as those generated by a bug-fixed yacc. And it implements the
thing not with cleverly encoded tables, but with two gigantic C
switch-statements! That makes the parser very fast, but much much larger,
proportionately, than its yacc-generated counterpart. So? S'no deal,
Home! Memory is cheap.
Yes, the program takes a while to generate the output, about forty-five
seconds for a big grammar on a Sun 3/60. I haven't tried it on a Sun-4.
Maybe 10-20 seconds? What the heck. Used to be we had to wait a day
to get our printout back from the sysop. Now we kill a minute writing
to comp.compilers or something.
Yacc was devised to run its output on machines which were tiny by
today's standards. I guess there may still be machines around on which
doubling or tripling the size of yacc's tables would be significant,
but they are becoming rarer and rarer. The Sun on my desk, with 16 meg of
ram and 30 or so meg of virtual, sure doesn't care about an extra 16K.
Somebody will put forth the often quoted statistic that parsing accounts
for only 15% of the time in a typical compiler, and so go ahead and make
it slower. I vote for making the tables bigger, if that's the choice,
because there are applications for yacc in which parsing speed *is*
significant. Besides, that fix is trivial, and you know it is correct.
In any case, by all means *do* fix the bug. The way it is now is
just plain wrong.
[From djones@megatest (Dave Jones)]
Return to the
Search the comp.compilers archives again.