Re: Have I discovered something new?

"Chris F Clark" <cfc@shell01.TheWorld.com>
4 Aug 2002 11:39:57 -0400

          From comp.compilers

Related articles
[3 earlier articles]
Have I discovered something new? cfc@world.std.com (Chris F Clark) (2002-07-21)
Re: Have I discovered something new? kgw-news@stiscan.com (2002-07-21)
Re: Have I discovered something new? whopkins@alpha2.csd.uwm.edu (Mark) (2002-07-24)
Re: Have I discovered something new? steve@lurking.demon.co.uk (Steve Horne) (2002-07-25)
Re: Have I discovered something new? robert.corbett@sun.com (Robert Corbett) (2002-07-31)
Re: Have I discovered something new? whopkins@alpha2.csd.uwm.edu (Mark) (2002-07-31)
Re: Have I discovered something new? cfc@shell01.TheWorld.com (Chris F Clark) (2002-08-04)
Re: Have I discovered something new? whopkins@alpha2.csd.uwm.edu (Mark) (2002-08-04)
| List of all articles for this month |

From: "Chris F Clark" <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: 4 Aug 2002 11:39:57 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 02-07-057 02-07-079 02-07-126
Keywords: parse
Posted-Date: 04 Aug 2002 11:39:56 EDT

Robert Corbett (replying to me) wrote:
> I doubt it. Consider the grammar
>
> S -> aSa | a
>
> This grammar is unambiguous, but it features unbounded nondeterminism.


Yes, but the non-determinisim is simply recursive. One simply makes a
table that always pushes the current rule onto the stack until one
encounters the trailing-context, which in this case is the eof
token. Upon finding the eof token, one performs the reductions from
outside-in. This effectively makes the parser work from left-to-right
and then from right-to-left.


Just because we consider our basic parsing technology to be LR
(i.e. it is in the shift-reduce family), does not mean that it parses
strictly left-to-right. It parses left-to-right until it gets into
lookahead-conflicts. However, in lookahead conflicts it continues
scanning left-to-right, but it can reduce from either the left-end or
the right-end. That reduction can then allow a subsequent reduction,
again from either end.


Of course, that means our internal representation is not strictly a
"stack" (its more like a deque). As theorists know, using a deque
makes our machine a "Post machine" which is equivalent to a TM. It's
just that you cannot "program" this nachine arbitrarily. The parser
generator does the actual programming and limits the TM's that can be
constructed.


The goal being a machine which recognizes all unambiguous grammars
(and all grammars that have an unambiguous deterministic parse ala
Aho, Ullman, and Johnson's paper on Deterministic Parsing of Ambiguous
Grammars). The problem being that no algorithm can decide which
grammars are ambiguous given the entire set of cfg's. Thus, sometimes
our generator simply fails to halt. However, when it halts one knows
there is an unambiguous parse of the grammar and it has found it.


> A problem with parsing ambiguous grammars is choosing a suitable
> representation for the derivations. Consider the grammar
>
> S -> S | a


We consider this grammar an error and flag it as such. It is an
ambiguity to reach a state where two rules (S->S and S->a) have the
same trailing context. Any recursion (direct or indirect) that does
not add some non-null context on either the left or right end, causes
an ambiguity of this type. We don't detect it by detecting the
recursion, but one could.


> The language generated by the grammar can be recognized in linear
> time. The grammar can be parsed in linear time using a two-headed
> bi-directional parsing automaton if the automaton can tell when the
> two heads are positioned on the same input symbol.
>
> Determinisic grammars and grammars with bounded nondeterminism can be
> parsed in linear time. That is an old result. I learned it thirty
> years ago.


Please note, I am not claiming any new theoretical results, nor even
any practical ones at the moment--I was just replying to someone who
was asking whether they might have (and I said "maybe, given what I
know"). Our implementation is still a work-in-progress. It may
forever be, as we don't have the time to do more than tinker with it,
when we have time to do even that.


I do understand enough of how it works that I will eventually document
it in my sigplan column. However, that's not high on my list of
priorities and won't occur for another year at least.


You may still say ho-hum when it comes out. I don't believe our ideas
are "revolutionary", just simply the natural evolution of LR parsing
technology. The result is the product of adding together ideas from
numerous other people (I do *try* to read every paper (and web page)
on parsing that I can find.), plus a few serendipitous mistakes.


Sincerely, (and hopefully humbly)
-Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.