DFA's and LR machine (was Re: Trouble understanding...)

Carl Cerecke <cdc@maxnet.co.nz>
21 Apr 2004 00:44:31 -0400

          From comp.compilers

Related articles
Trouble understanding LL(k) vs. LALR(k) etc. zork_666@nospammail.net (Johnathan) (2004-03-11)
Re: Trouble understanding LL(k) vs. LALR(k) etc. maeder@tzi.de (Christian Maeder) (2004-03-15)
Re: Trouble understanding LL(k) vs. LALR(k) etc. cdc@maxnet.co.nz (Carl Cerecke) (2004-03-19)
Re: Trouble understanding LL(k) vs. LALR(k) etc. cfc@shell01.TheWorld.com (Chris F Clark) (2004-03-26)
Re: Trouble understanding LL(k) vs. LALR(k) etc. clint@0lsen.net (Clint Olsen) (2004-04-14)
Re: Trouble understanding LL(k) vs. LALR(k) etc. cfc@shell01.TheWorld.com (Chris F Clark) (2004-04-15)
DFA's and LR machine (was Re: Trouble understanding...) cdc@maxnet.co.nz (Carl Cerecke) (2004-04-21)
Re: DFA's and LR machine (was Re: Trouble understanding...) wyrmwif@tsoft.com (SM Ryan) (2004-04-28)
Re: DFA's and LR machine (was Re: Trouble understanding...) cdc@maxnet.co.nz (Carl Cerecke) (2004-05-08)
Re: DFA's and LR machine (was Re: Trouble understanding...) cfc@shell01.TheWorld.com (Chris F Clark) (2004-05-09)
Re: DFA's and LR machine (was Re: Trouble understanding...) wyrmwif@tsoft.com (SM Ryan) (2004-05-16)
| List of all articles for this month |

From: Carl Cerecke <cdc@maxnet.co.nz>
Newsgroups: comp.compilers
Date: 21 Apr 2004 00:44:31 -0400
Organization: Compilers Central
References: 04-03-042 04-03-057 04-03-072 04-03-091 04-04-029 04-04-049
Keywords: parse
Posted-Date: 21 Apr 2004 00:44:31 EDT

I have another possible way of combining DFAs with an LR parsing
machine. I came up with it while finishing my thesis, but never had time
to fully investigate it.


The core idea is to translate the DPDA generated by, say, LALR(1) into
a DFA-with-counters. How? A stack in the DPDA completely describes the
state of the parser. If we enumerate all possible stacks, then each
possible stack becomes a state in the DFA. Unfortunately, of course,
stacks may be infinitely long. We can get around that by noticing that
longer stacks will always have repeated parts - where loops appear in
the DPDA. Instead of repeating these parts in the stack, we can just
have a counter for each possible repeated part. Now our set of
possible stacks is finite - it consists of all stacks that contain no
repeating parts. Each possible non-repeating stack is now a node in
the DFA. A shift-edge in the DPDA becomes an edge in the DFA that
might optionally increment a repeat-counter. A reduction-edge becomes
one or more edges in the DFA, depending on the repeat-counters, and
the edge may have a decrement-counter operation attached to
it. Regular expressions would fit neatly into this model, as they are
simply DFAs with no counters.


My hunch is that this transformation would work for a large class of LR
grammars, but possibly fail on some bizarre and unusual constructs (such
as, perhaps, A -> '[' '[' A ']' ']' ']' ']' ). I think the class of
languages on which it would work would include all known (sensible)
programming languages for which an LR grammar exists. The DFA is likely
to be large, but may be amenable to compression. Another good point is
that, if the grammar contains ambiguities, an NFA is generated, from
which a DFA can then be mechanically derived. Although I don't know if
the presence of counters makes that process more difficult.


I've been meaning to code up an example, but haven't had the time. I
would be interested in comments on the idea though.


Cheers,
Carl Cerecke


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.