15 Feb 1999 23:23:46 -0500

Related articles |
---|

Minding my Ps and Qs peter.r.wilson@boeing.com (Peter R. Wilson) (1999-02-15) |

Re: Minding my Ps and Qs torbenm@diku.dk (Torben Mogensen) (1999-02-16) |

Re: Minding my Ps and Qs paul.janssens@skynet.be (JPA) (1999-02-21) |

From: | "Peter R. Wilson" <peter.r.wilson@boeing.com> |

Newsgroups: | comp.compilers |

Date: | 15 Feb 1999 23:23:46 -0500 |

Organization: | The Boeing Company |

Keywords: | parse |

I am trying to massage an ambiguous grammar for the EXPRESS-X

language into a form suitable for LL(1) parsing. I use Wirth's syntax

notation (WSN) (i.e., {} means zero or more repetitions and [] means

zero or one repetitions). After some work I got stuck on a term in the

grammar that was of the form:

1 t = { P | Q } P .

That is, sentences consisting of any number of P and/or Q, followed by

P. This is LL(oo) as written. In trying to massage this I made the

assumption that an equivalent production is one that could be used to

generate an identical set of sentences (I have a tool that does sentence

generation from a WSN grammar). After some fiddling around I came up

with what turned out to be a near equivalent, namely

2 t = P | Q { Q } P | t P .

This generates strings of any number of Ps and/or Qs, followed by one or

two Ps. Production 2 is directly left recursive, but the compiler books

I have read say that you can eliminate left recursion, as per the

following (if I have written it correctly in WSN).

Given a left recursive production

a = b | a c .

Then an equivalent non left recursive grammar is

a = b aa .

aa = [ c aa ] .

So, I applied this to (2) to produce the equivalent:

3 t = ( P | Q { Q } P ) tt .

4 tt = [ P tt ] .

BUT this does not produce the same set of sentences as does (2)

(basically once a sequence of Qs followed by a P has been generated then

there will be no more Qs --- production (4) generates a string of Ps).

On the other hand, using (5) instead of (4) appears to make (3)

equivalent to (2), where

5 tt = [ P t ] .

I must have gone wrong somewhere, but I can't figure out where. Can

anyone explain it to me?

By the way, I think that the equivalent LL(1) production to (1) is:

6 t = ( P | Q { Q } P ) [ t ] .

Anyone willing to confirm this?

I haven't got around yet to writing an actual parser, but have put

the grammar through a tool that checks if a WSN grammar is LL(1). As I

coded both of the tools, they might be in error somewhere.

Thanks for any help.

Peter W.

peter.r.wilson@boeing.com

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.