8 May 2004 21:11:04 -0400

Related articles |
---|

[2 earlier articles] |

Re: Trouble understanding LL(k) vs. LALR(k) etc. cdc@maxnet.co.nz (Carl Cerecke) (2004-03-19) |

Re: Trouble understanding LL(k) vs. LALR(k) etc. cfc@shell01.TheWorld.com (Chris F Clark) (2004-03-26) |

Re: Trouble understanding LL(k) vs. LALR(k) etc. clint@0lsen.net (Clint Olsen) (2004-04-14) |

Re: Trouble understanding LL(k) vs. LALR(k) etc. cfc@shell01.TheWorld.com (Chris F Clark) (2004-04-15) |

DFA's and LR machine (was Re: Trouble understanding...) cdc@maxnet.co.nz (Carl Cerecke) (2004-04-21) |

Re: DFA's and LR machine (was Re: Trouble understanding...) wyrmwif@tsoft.com (SM Ryan) (2004-04-28) |

Re: DFA's and LR machine (was Re: Trouble understanding...) cdc@maxnet.co.nz (Carl Cerecke) (2004-05-08) |

Re: DFA's and LR machine (was Re: Trouble understanding...) cfc@shell01.TheWorld.com (Chris F Clark) (2004-05-09) |

Re: DFA's and LR machine (was Re: Trouble understanding...) wyrmwif@tsoft.com (SM Ryan) (2004-05-16) |

From: | Carl Cerecke <cdc@maxnet.co.nz> |

Newsgroups: | comp.compilers |

Date: | 8 May 2004 21:11:04 -0400 |

Organization: | TelstraClear |

References: | 04-03-042 04-03-057 04-03-072 04-03-091 04-04-029 04-04-049 04-04-058 |

Keywords: | parse |

Posted-Date: | 08 May 2004 21:11:04 EDT |

Carl Cerecke wrote:

*> The core idea is to translate the DPDA generated by, say, LALR(1) into*

*> a DFA-with-counters. How? A stack in the DPDA completely describes the*

*> state of the parser. If we enumerate all possible stacks, then each*

*> possible stack becomes a state in the DFA. Unfortunately, of course,*

*> stacks may be infinitely long. We can get around that by noticing that*

*> longer stacks will always have repeated parts - where loops appear in*

*> the DPDA. Instead of repeating these parts in the stack, we can just*

*> have a counter for each possible repeated part. Now our set of*

*> possible stacks is finite - it consists of all stacks that contain no*

*> repeating parts. Each possible non-repeating stack is now a node in*

*> the DFA. A shift-edge in the DPDA becomes an edge in the DFA that*

*> might optionally increment a repeat-counter. A reduction-edge becomes*

*> one or more edges in the DFA, depending on the repeat-counters, and*

*> the edge may have a decrement-counter operation attached to*

*> it. Regular expressions would fit neatly into this model, as they are*

*> simply DFAs with no counters.*

OK. Had I engaged my brain before my fingers, I would have remembered

that, for grammars that include constructs like:

S -> '(' S ')' | '[' S ']' | z

a stack is *required*, because an entire sequence of ( and [ must be

remembered in order to see the matching token on the other side of the

z. If we only use counters to count how many we've seen, then we will

know how many of ) and ] we expect, but we won't know in what order

they are required. This could be handled with backtracking, or parsing

multiple possibilities at once. Pretty ugly, and likely to be

inefficient, of course. Lookahead can be used, but a finite lookahead

will end up accepting strings not in the language. i.e. LA of 1 would,

for the prefix ([([(z) accept any combination of two ) and two ].

So, a stack is basically required. But what is minimally required to

go onto the stack? I think a stack is required only for loops arising

from mutually nestable middle-recursive rules: when you encounter one

of these loops, you push an appropriate symbol, and at the appropriate

reduction you pop off the symbol and use it to decide where to go. All

other loops can increment a simple counter, and the appropriate

reduction decrements it (Or, for suitably masochistic grammars, a

multiple decrement may be required: S->aSb|aaSc|a)

And so, it should be possible to convert nearly[1] any LR parser to a

mostly-DFA with some counters and the odd push and pop. Integrating

regular expressions into that should be simple. (See! There is a point)

Cheers,

Carl.

[1] Anybody who creates ugliness like:

S-> acSc | acacSf | abSd | z

deserves a full-blown LR machine, I think.

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.