Re: Have I discovered something new?

"Mark" <whopkins@alpha2.csd.uwm.edu>
4 Aug 2002 11:48:37 -0400

          From comp.compilers

Related articles
[4 earlier articles]
Re: Have I discovered something new? kgw-news@stiscan.com (2002-07-21)
Re: Have I discovered something new? whopkins@alpha2.csd.uwm.edu (Mark) (2002-07-24)
Re: Have I discovered something new? steve@lurking.demon.co.uk (Steve Horne) (2002-07-25)
Re: Have I discovered something new? robert.corbett@sun.com (Robert Corbett) (2002-07-31)
Re: Have I discovered something new? whopkins@alpha2.csd.uwm.edu (Mark) (2002-07-31)
Re: Have I discovered something new? cfc@shell01.TheWorld.com (Chris F Clark) (2002-08-04)
Re: Have I discovered something new? whopkins@alpha2.csd.uwm.edu (Mark) (2002-08-04)
| List of all articles for this month |

From: "Mark" <whopkins@alpha2.csd.uwm.edu>
Newsgroups: comp.compilers
Date: 4 Aug 2002 11:48:37 -0400
Organization: University of Wisconsin - Milwaukee, Computing Services Division
References: 02-07-057 02-07-079 02-07-126
Keywords: parse
Posted-Date: 04 Aug 2002 11:48:37 EDT

"Robert Corbett" <robert.corbett@sun.com> writes:
>> I believe for non-ambiguous grammar the process of doing so always
>> terminates.
>I doubt it. Consider the grammar
> S -> aSa | a
>This grammar is unambiguous, but it features unbounded nondeterminism.


This is a good example to plug into the algebraic formalism I described in
previous articles. Through it you can see how the claim above can be
fulfilled.


First extended "canonically" to the bottom-up SSDTG for the SSDT L you'd have
the following (listed in algebraic form):
                              L >= Sz
S >= aSax + ay
with
input alphabet X = {a}
output alphabet Y = {x,y,z}
and 'standard' interpretation:
x = "reduce [S] <- [aSa]"
y = "reduce [S] <- [a]"
z = "reduce root <- [S]".


Defining SF as the least solution to:
SF >= |0>z + |1>ax SF
and SS by
SS = S SF
you get:
<0| SS = S <0| SF <= Sz
<1| SS = S <1| SF <= Sax SF
so that
                          L >= SZ >= <0| SS
SS = S SF >= aSax SF + ay SF >= a <1| SS + ay SF
thus yielding the following right-linear system:
L >= <0| A
A >= a <1| A + ay B
B >= |0>z + |1>ax B
whose least solution is:
B = (|1>ax)* |0>z
A = (a<1|)* ay (|1>ax)* |0>z
and
L = <0| (a<1|)* ay (|1>ax)* |0>z.


This is the context-free expression that represents the
SSDT in question:
        L = { vw in X* x Y*: v in X*, w in Y*, w encodes a parse tree for v }


The relevant question at hand is to determine what is:
L[v] = { w in Y*: vw is in L }, for any v in X*
which provides a representation of the set of parses for the given
word v.


This, in fact, can be read directly off the expression itself, since:


          L = <0| (a<1|)* ay (|1>az)* |0>z
              = sum (<0| (a<1|)^m ay (|1>az)^n |0>z: m, n >= 0)
              = sum (a^{m+1+n} <0|<1|^m y (|1>x)^n |0>z: m, n >= 0)
              = sum (a^{m+1+n} <0| <1|^m |1>^n |0> y x^n z: m, n >= 0)


So that the expression L[v] corresponding to v = a^p is just:


              L[a^p] = sum (<0| <1|^m |1>^n |0> y x^n z: m, n >= 0; m+n+1 = p)


The operator expression <0| <1|^m |1>^n |0> is what the "precomputed
stack expression" would correspond to; under the interpretation


<0| = "create new stack"
|0> = "test for empty stack and clear stack"
<i| = "push i onto stack", i = 1, 2, 3, ...
|i> = "pop and test for equality to i", i = 1, 2, 3, ...


In the application at hand, this can be further optimized since


<0| <1|^m |1>^n |0> = 1 if m = n, 0 else
so that
L[a^{2n+1}] = y x^n z; L[a^{2n}] = 0


>A problem with parsing ambiguous grammars is choosing a suitable
>representation for the derivations. Consider the grammar
> S -> S | a


An example even worse than this was actually treated in my previous article.


>The single string generated by this grammar has an infinite number of
>derivations.


In fact, the derivation words form a CONTEXT-FREE language in their
own right (which, to reiterate the point made previously, is why any
means of representing parse-sets must amount to a theory and notation
for context-free expressions).


If you write the canonical extension to an SSDT as:


L >= Sz; S >= Sx + ay
then this reduces to the context-free (actually, regular) expression


                                              L = ay x* z
so that
L[a] = y x* z.


Your example's not the worst possible since the parse set is ONLY a regular
language, albeit one that produces an infinite of parse words. The worst
example is if you add in the extra items


S -> SS; and S -> 1 (empty word).


Then we're talking business.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.