Re: Is that difficult to write a Shift-Reduce parser

Chris F Clark <cfc@shell01.TheWorld.com>
Sun, 02 May 2010 15:07:40 -0400

          From comp.compilers

Related articles
Is that difficult to write a Shift-Reduce parser kuangpma@gmail.com (kuangpma) (2010-04-29)
Re: Is that difficult to write a Shift-Reduce parser kym@sdf.lonestar.org (russell kym horsell) (2010-05-01)
Is that difficult to write a Shift-Reduce parser chakaram@auth.gr (Chariton Karamitas) (2010-05-02)
Re: Is that difficult to write a Shift-Reduce parser rpw3@rpw3.org (2010-05-01)
Re: Is that difficult to write a Shift-Reduce parser cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-02)
Re: Is that difficult to write a Shift-Reduce parser dot@dotat.at (Tony Finch) (2010-05-04)
Re: Is that difficult to write a Shift-Reduce parser sh006d3592@blueyonder.co.uk (Stephen Horne) (2010-05-07)
Re: Is that difficult to write a Shift-Reduce parser rpw3@rpw3.org (2010-05-09)
| List of all articles for this month |

From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: Sun, 02 May 2010 15:07:40 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 10-04-072 10-05-003
Keywords: parse, LR(1)
Posted-Date: 02 May 2010 22:35:14 EDT

kuangpma <kuangpma@gmail.com> wrote:


>> I am learning compiler theory, several books teach how to write (even
>> with source code) a Recursive Decent parser but not Shift Reduce
>> parser, only tell the theroy behind it. Particularly the Louden book,
>> it says that handcrafting a Shift Reduce parser is very difficult so
>> it suggests to use tools such like yacc to generate the parser.


russell kym horsell <kym@sdf.lonestar.org> writes:
> In the old days people (yes, actual people) used to create the tables for
> S/R parsers by hand. So it can't be hard, can it?


There is a story that Steve Johnson tells about how yacc first got
written. This is 2nd hand, so I may get it slightly wrong. In any case,


Steve was working on an early C compiler at ATT and trying to get the
parser written. He ended up talking to Al Aho about it, who started
to teach him the LR method developed by Knuth. Anyway, Aho set out to
write by hand the LR tables for C. He ended up making these huge
sheets filled with tables. Steve realized there had to be an easier
way and so he got Aho to explain how he was creating the tables.
Steve realized that it was an easy enough algorithm that he could
program into the machine. From, this yacc sprang up.


<end of story>


As, John our moderator said, for a non-trivial language creating the
LR tables is often tedious. Even to some extent so is creating a
recursive descent parser. Moreover, past a fairly minimal complexity
it isn't that educational. It is much better to learn how to do so on
a small language, but quick put that behind you and concentrate on
writing grammars and using a tool to generate the parser in the style
you like.


Digression: Herein lies a partial explanation while writing recursive
descent parsers is less tedious compared to grammar writing than
writing LR tables. The characteristics to make a grammar suitable for
LL are very similar to the characteristics one needs to write the RD
parser (by hand). Thus, the tedious parts, getting the precedence
hierarchy right, making certain you have left-factored everything that
needs it, etc. are present in the grammar writing stage. Once, you
have an LL grammar, it is essentially a trivial translation to the RD
parser.


With an LR grammar (especially one augmented with precedence
directives), you don't need to do the tedious part at the grammar
level, except for resolving conflicts if your grammar has them. The
tedious part is all in translating the grammar to the tables. So,
less tedium up-front, but more on the back end.


Still, in either case, LL or LR, a tool does relieve some of the
tedium and more importantly can apply checks to ensure that you have a
valid grammar and haven't violated the requirements of the parsing
method. To me that last fact is the most important argument against
hand-coded RD, as it allows one to make un-caught mistakes that only
appear when one uses the grammar and it mis-parses some input text.


Anyway, to answer your question, doing LR tables by hand isn't that
hard. I often do it, or doing subset construction to create a DFA
from an NFA, which is essentially the same algorithm, on a whiteboard
when we are discussing something we are building. However, I never do
it for production code. We always have the algorithm encoded in
software and feed it a relevant "grammar" (or set of regular
expressions), to get the result we desire. Again, much of the point
of that is not only do we know we get correct tables (or whatever
representation we want the software to generate), but we get warnings
and errors when we feed the tool something that violates the key
assumptions.


So, for your learning, go ahead and learn to generate the LR tables
for a simple language. You can even practice several times if you
want to be certain you understand. However, after doing that switch
to using a tool.


Just my advice,
-Chris


******************************************************************************
Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
[The story may well be true, but there were lots of other parser
generators around in the 1970s, hence the "ya" part of yacc. The
reason yacc is still around and the others are dead is that it let you
put your action code next to the grammar rules, generated the required
switch statement, and plugged in the references to the token stack.
Before that, you fed your grammar to a parser generator and got back a
list of rule numbers and a table of state information, leaving it up
to you to glue everything together, and making small changes painful
if they changed the rule numbers. -John]





Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.