Re: LR(k) parser generator for k>1?

Chris F Clark <cfc@shell01.TheWorld.com>
Wed, 28 May 2008 07:34:06 -0400

From comp.compilers

Related articles
LR(k) parser generator for k>1? txchen@gmail.com (Tom) (2008-05-18)
Re: LR(k) parser generator for k>1? haberg_20080406@math.su.se (Hans Aberg) (2008-05-20)
Re: LR(k) parser generator for k>1? joevans@gmail.com (Jason Evans) (2008-05-20)
Re: LR(k) parser generator for k>1? parrt@cs.usfca.edu (parrt) (2008-05-20)
*Re: LR(k) parser generator for k>1? cfc@shell01.TheWorld.com (Chris F Clark)* (2008-05-28)**
Re: LR(k) parser generator for k>1? txchen@gmail.com (Thomas Chen) (2008-05-29)
Re: LR(k) parser generator for k>1? kamalpr@hp.com (kamal) (2008-06-03)
Re: LR(k) parser generator for k>1? cfc@shell01.TheWorld.com (Chris F Clark) (2008-06-03)
Re: LR(k) parser generator for k>1? FSet.SLB@gmail.com (Scott Burson) (2008-06-08)

| List of all articles for this month |

From:	Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups:	comp.compilers
Date:	Wed, 28 May 2008 07:34:06 -0400
Organization:	The World Public Access UNIX, Brookline, MA
References:	08-05-075
Keywords:	LR(1), parse
Posted-Date:	28 May 2008 21:20:06 EDT

Tom <txchen@gmail.com> writes:

> I have searched on Internet and the comp.compilers news group for
> LR(k) parser generator implementation where k > 1.
>
> The most important theorectical work since 1980s seem to include those
> of M. Ancona (1980s-1990s) and Terence Parr (1993). Many people share
> Terence's idea of splitting the atomic k-tuple. The LR(k)
> implementation attempts I could find so far include:
>
> - Lark of the Cocktail Toolbox. By Josef Grosch. 1995. His online
> comment said It was practical for medium size grammars only for LR(1).
> LR(k) was even more expensive.
> - LR(1). Gofer. By Bob Buckley. 1995. He said It was a long way from
> being production software though.
> - Ancona et. al. also claimed to have an implementation, but so
> further information is available.
> - In 2005 Karsten Nyblad was planning to work on a LR(k) parser
> generator. No more information available.
> - Yacc++ by Chris F Clark implemented LR(k) but it seemed to have an
> infinite loop issue.
> - Ralph Boland was working something in this direction.
> - MSTA of the COCOM toolset implemented LR(k) in 1998 and claimed to
> be efficient.
>
> It seems the only one that claims to be fully functional and efficient
> is MSTA. Anyone used it and how was it like?
>
> Would it be meaningful to implement one? I'm very interested to know.

Sorry, for taking so long to respond. I am current away on a long
trip and thus checking news infrequently. I want to respond directly
to your question of meaningfulness.

The answer depends on where you want to go from there. For example,
if you want to implement an LR(k) parser generator so that you better
understand the theory--that is certainly a meanignful exercise. You
could also start with your LR(k) model to explore different parts of
parser generation. Although the field doesn't seem that lively, there
are plenty of things left to be discovered.

It would also be meaningful to implement an LR(k) parger generator if
you had a language which that could not figure out how to write an
LR(1) gramar for, and you wanted to verify that the language was not
ambiguous. One of the "flaws" of many of the advanced parsing
techniques (e.g. GLR, PEG, predicated grammars, TAGs, etc.) is that
they don't distinguish well between ambiguous and non-ambiguous
grammars. As a result, it is possible to write a gramar that parses
some string to something other than what you intend.

It might also be meaningful to do so from a practical perspective. An
LR(k) parser should still run in linear time. That is a nice
guarantee for run-time performance analysis. Of the advanced parsing
methods, only the PEG/LL(infinity) approach also has that guarantee.

Now, for the caveat, there always is one. Unless you are doing it
purely for yourself, it isn't meaningful to invent yet another
incompatible parser generator, especially if the only difference is
that yours will be LR(k). If you are going to do this and release it
to the world, it is much better if you enhance an existing parser
generator.

The default choice for doing so is probably BISON. However, if you do
it to BISON, please try to ensure that your version gets incorporated
into the standard distribution. Over the years, there have been
several variants of BISON that have gotten started, but descended into
niche un-supported status and thus aren't widely useful.

With that in mind, I am willing to support your doing so in Yacc++.
There is now a free version, if that is important to you. Moreover,
if you get the work done (even if only partially done), I will help
you to guarantee that your code remains supported. Also, if you are
interested in the field, I am more than willing to exchange ideas with
you, particularly if they result in enhancing Yacc++. In particular,
there are several "next steps" you could do with an LR(k) parser
generator (or even just an LR(1) one), that should result in
advancements to the field that would be publishable. Finally, if we
were working together and you were looking to do an internship, I
could probably get one at Intel for you (my day job), although it
would be somewhat tangential, since my work there is not directly on
parsing at the moment, but on hardware implementation of FAs for a
different application area.

Thus, if you are interested in the project and would like to work with
Yacc++ as a base for your implementation, contact me by email and I
will see if I can't make that possible.

Best of luck,
-Chris

******************************************************************************
Chris Clark Internet: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. or: compres@world.std.com
23 Bailey Rd Web Site: http://world.std.com/~compres
Berlin, MA 01503 voice: (508) 435-5016
USA fax: (978) 838-0263 (24 hours)

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: LR(k) parser generator for k>1?

Chris F Clark <cfc@shell01.TheWorld.com>Wed, 28 May 2008 07:34:06 -0400

Chris F Clark <cfc@shell01.TheWorld.com>
Wed, 28 May 2008 07:34:06 -0400