9 May 2004 22:27:07 -0400

Related articles |
---|

[3 earlier articles] |

Re: Trouble understanding LL(k) vs. LALR(k) etc. cfc@shell01.TheWorld.com (Chris F Clark) (2004-03-26) |

Re: Trouble understanding LL(k) vs. LALR(k) etc. clint@0lsen.net (Clint Olsen) (2004-04-14) |

Re: Trouble understanding LL(k) vs. LALR(k) etc. cfc@shell01.TheWorld.com (Chris F Clark) (2004-04-15) |

DFA's and LR machine (was Re: Trouble understanding...) cdc@maxnet.co.nz (Carl Cerecke) (2004-04-21) |

Re: DFA's and LR machine (was Re: Trouble understanding...) wyrmwif@tsoft.com (SM Ryan) (2004-04-28) |

Re: DFA's and LR machine (was Re: Trouble understanding...) cdc@maxnet.co.nz (Carl Cerecke) (2004-05-08) |

Re: DFA's and LR machine (was Re: Trouble understanding...) cfc@shell01.TheWorld.com (Chris F Clark) (2004-05-09) |

Re: DFA's and LR machine (was Re: Trouble understanding...) wyrmwif@tsoft.com (SM Ryan) (2004-05-16) |

From: | Chris F Clark <cfc@shell01.TheWorld.com> |

Newsgroups: | comp.compilers |

Date: | 9 May 2004 22:27:07 -0400 |

Organization: | The World Public Access UNIX, Brookline, MA |

References: | 04-03-042 04-03-057 04-03-072 04-03-091 04-04-029 04-04-049 04-04-058 04-05-030 |

Keywords: | parse |

Posted-Date: | 09 May 2004 22:27:07 EDT |

Carl Cereke wrote in a follow up to a question he asked about parsing

with dfa's + counters:

*> Ok. Had I engaged my brain before my fingers, I would have remembered*

*> that, for grammars that include constructs like:*

*>*

*> S -> '(' S ')' | '[' S ']' | z*

Actually, that is not a problem, you simply at each step multiply your

counter by 3 and add 1 for [ or 2 for (.

0 -> z

1 -> ( z )

2 -> [ z ]

3 unused

4 -> (( z ))

5 -> ([ z ])

6 unused

7 -> [( z ])

8 -> [[ z ]]

9-12 unused

13 -> ((( z )))

...

One could find an even better encoding quite easily. Of course, all

I've done is translate the stack into integer operations, but that's

the idea behind counters anyway. I'm sure there some nice proof by

Church, Turing, Goedel, Post or someone that shows that one can do

this completely generally--my automata theory is quite rusty these

days.

In any case, so the answer is one can implement any LR parser, by a

DFA plus counters (if you allow sufficient integer arithmetic). With

a little more work I think one can even do an Earley style parser that

way. In fact, if a DFA + infinite (unbounded) integer arithmetic were

not Turing Complete I would be extremely surprised.

------------------------------------------------------------------------

Taking your thought another direction, if you shift by 2

(i.e. multiply by 4) rather than multiply by 3, it becomes pretty

clear that the integer represents a stack with 2 bits per entry. This

reminds me of a wonderful old paper in the British Computing Society's

Journal on using parsing to compress. The author worked out a way to

encode the parsing decisions of both LL and LR parsers as integers

(bit vectors), using the minimal space. The idea is quite obvious in

retrospective, at each point in the parse, you have a series of

decisions to make based on the input, and you encode those decisions

according to their probabilities, very much like arithmetic encoding

(actually, I think it is exactly like arithmetic encoding, although

the author may have used a simpler scheme like Huffman encoding).

Note, your original idea of using a simple counter to handle pure

left or right recursion is equivalent to using run-length encoding for

that part of the grammar. However, I think if you do the arithmetic

encoding right, you will be at most a couple bits longer than the

run-length encoding for those cases. (My compression theory is no

better than my automata theory, so you'ld be better off asking someone

else.)

Hope this helps,

-Chris

*****************************************************************************

Chris Clark Internet : compres@world.std.com

Compiler Resources, Inc. Web Site : http://world.std.com/~compres

23 Bailey Rd voice : (508) 435-5016

Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.