21 Apr 2004 00:44:31 -0400

From: | Carl Cerecke <cdc@maxnet.co.nz> |

Newsgroups: | comp.compilers |

Date: | 21 Apr 2004 00:44:31 -0400 |

Organization: | Compilers Central |

References: | 04-03-042 04-03-057 04-03-072 04-03-091 04-04-029 04-04-049 |

Keywords: | parse |

Posted-Date: | 21 Apr 2004 00:44:31 EDT |

I have another possible way of combining DFAs with an LR parsing

machine. I came up with it while finishing my thesis, but never had time

to fully investigate it.

The core idea is to translate the DPDA generated by, say, LALR(1) into

a DFA-with-counters. How? A stack in the DPDA completely describes the

state of the parser. If we enumerate all possible stacks, then each

possible stack becomes a state in the DFA. Unfortunately, of course,

stacks may be infinitely long. We can get around that by noticing that

longer stacks will always have repeated parts - where loops appear in

the DPDA. Instead of repeating these parts in the stack, we can just

have a counter for each possible repeated part. Now our set of

possible stacks is finite - it consists of all stacks that contain no

repeating parts. Each possible non-repeating stack is now a node in

the DFA. A shift-edge in the DPDA becomes an edge in the DFA that

might optionally increment a repeat-counter. A reduction-edge becomes

one or more edges in the DFA, depending on the repeat-counters, and

the edge may have a decrement-counter operation attached to

it. Regular expressions would fit neatly into this model, as they are

simply DFAs with no counters.

My hunch is that this transformation would work for a large class of LR

grammars, but possibly fail on some bizarre and unusual constructs (such

as, perhaps, A -> '[' '[' A ']' ']' ']' ']' ). I think the class of

languages on which it would work would include all known (sensible)

programming languages for which an LR grammar exists. The DFA is likely

to be large, but may be amenable to compression. Another good point is

that, if the grammar contains ambiguities, an NFA is generated, from

which a DFA can then be mechanically derived. Although I don't know if

the presence of counters makes that process more difficult.

I've been meaning to code up an example, but haven't had the time. I

would be interested in comments on the idea though.

Cheers,

Carl Cerecke

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.