Re: inling/tail recursion, recursive decent parser

Chris F Clark <cfc@shell01.TheWorld.com>
2 Jul 2005 20:25:54 -0400

          From comp.compilers

Related articles
inling/tail recursion, recursive decent parser spam@disney.com (Lefteris Keftedopoulos) (2005-06-21)
Re: inling/tail recursion, recursive decent parser henry@spsystems.net (2005-06-23)
Re: inling/tail recursion, recursive decent parser diablovision@yahoo.com (2005-06-30)
Re: inling/tail recursion, recursive decent parser ronny@cs.ru.nl (Ronny Wichers Schreur) (2005-07-02)
Re: inling/tail recursion, recursive decent parser cfc@shell01.TheWorld.com (Chris F Clark) (2005-07-02)
| List of all articles for this month |

From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: 2 Jul 2005 20:25:54 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 05-06-100 05-06-144
Keywords: code, parse
Posted-Date: 02 Jul 2005 20:25:54 EDT

Diablovision raised some good points regarding tail recursion and
inlining and recursive descent parsing. However, the issue might go
farther than that. If the parser keeps the entire contents of the
current production visible, using iteration is not a panacea.
Consider the fragment below.


grammar: stmt*;
                { for (int i = 0; i < length($$); ++i) print($i); }


The idea is that we loop over the entire rule in the reduce code and
print all the statements. This can become a key limiting point in
your parser, depending on the implementation. Since the parser, if it
keeps track of all the entries in a rule, it is likely to do so in one
object and that object may be limited in size. I don't know Java's
rules here, but I've definitely worked in environments where a single
object was limited to 64K bytes, which if you have 4 byte pointers,
turns out to be too small in some cases. It's a fairly rare problem,
but it definitely does occur (and more often than one would naively
expect).


Now, in Yacc++ we allow the user to solve this problem, by allowing
them to write, something close to:


grammar: (stmt { print($last); free($last); })*;


In this case, the grammar object doesn't grow, as we throw the stmt
objects away upon processing them. Interestingly, if one writes the
recursive solution, the default semantics one is likely to write are
process and throw away. This is especially true if one doesn't have
an automatic AST tree builder as part of the system--and if one does
have a tree builder the recursive solution is like to use "cons cells"
(that is a bunch of small objects, one for each recursive call), which
can increase the total memory footprint, but keep one from creating
too large of single objects.


It is worth mentioning that alternately one can implement "big"
objects, where if the object grows to more than the implementation
limit, the implementation splits it up into smaller objects. That is
probably not a trivial exercise.


In any case, it is often useful to think through all the ramifications
when one is proposing an extension. Tail recursion elimination may be
nice. However, in this case it just moves the weakness to another
point in the system.


Historical note: we crossed this bridge sometime in the early 90's
with Yacc++. From the inception of our parser generator, we had
encouraged iteration over recursion. We had also implemented the
feature which allowed one to easily throw away stack entries, mostly
for getting rid of syntactic sugar. However, when a client showed us
this problem, we realized that processing and discarding the entries
of the top-level rule was important to do also, so we documented that
as a recommendation. Of course, the recursive solution also works
(and we don't do tail-recursion removal specifically so that it will
work), but we didn't want to recommend recursion simply as a patch for
this one case. It seems inconsistent.


Hope this helps,
-Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.