Mon, 10 Nov 2008 17:54:35 +0100

Related articles |
---|

[2 earlier articles] |

Re: How test if language is LL(k) or LR(k) ? sh006d3592@blueyonder.co.uk (Stephen Horne) (2008-10-29) |

Re: How test if language is LL(k) or LR(k) ? rpboland@gmail.com (Ralph Boland) (2008-10-31) |

Re: How test if language is LL(k) or LR(k) ? cfc@shell01.TheWorld.com (Chris F Clark) (2008-11-06) |

Re: How test if language is LL(k) or LR(k) ? torbenm@pc-003.diku.dk (2008-11-07) |

Re: How test if language is LL(k) or LR(k) ? ulf.schwekendiek@googlemail.com (ulf.schwekendiek@googlemail.com) (2008-11-07) |

Re: How test if language is LL(k) or LR(k) ? sh006d3592@blueyonder.co.uk (Stephen Horne) (2008-11-07) |

Re: How test if language is LL(k) or LR(k) ? torbenm@pc-003.diku.dk (2008-11-10) |

Re: How test if language is LL(k) or LR(k) ? cfc@shell01.TheWorld.com (Chris F Clark) (2008-11-10) |

Re: How test if language is LL(k) or LR(k) ? sh006d3592@blueyonder.co.uk (Stephen Horne) (2008-11-12) |

From: | torbenm@pc-003.diku.dk (Torben =?iso-8859-1?Q?=C6gidius?= Mogensen) |

Newsgroups: | comp.compilers |

Date: | Mon, 10 Nov 2008 17:54:35 +0100 |

Organization: | Department of Computer Science, University of Copenhagen |

References: | 08-10-044 08-10-055 08-11-033 08-11-035 |

Keywords: | parse, LR(1) |

Posted-Date: | 10 Nov 2008 15:25:03 EST |

Stephen Horne <sh006d3592@blueyonder.co.uk> writes:

*> On Fri, 07 Nov 2008 14:27:18 +0100, torbenm@pc-003.diku.dk (Torben AEgidius Mogensen) wrote:*

*>*

*>>Stephen Horne <sh006d3592@blueyonder.co.uk> writes:*

*>>*

*>>> For LR, *almost* all context-free grammars are LR(k < infinity), so if*

*>>> you can prove that the grammar is context-free you are almost there.*

*>>*

*>>That's not very useful, and not even true. There are many useful*

*>>grammars that are not LR(k) for any k and even useful languages for*

*>>which there doesn't exits LR(k) grammars for any k.*

*>*

*> Context-free grammars? And for any finite k?*

Indeed. The language of all even palindromic sequences over the

alphabet {a,b} is not LR(k) for any k. Note that this has an

unambiguous grammar:

P -> aPa

P -> bPb

P ->

*> My understanding is that you'd need an infinite input. While there are*

*> certainly applications where unbounded inputs occur, a genuinely*

*> infinite input can never be completely parsed anyway.*

It is true that for any particular input, you need only finite

lookahead, but for a grammar to be LR(k), the same k can be used for

all inputs.

*> I'm not saying I'm right here, as I'm certain you know more than me,*

*> but I'd like to see an example.*

*>*

*>>> You can always construct an LR(1) - or even LR(0) - state model to*

*>>> parse a context-free grammar, but if the grammar is not LR(1) that*

*>>> state model will have shift/reduce or reduce/reduce conflicts.*

*>>>*

*>>> Almost all such conflicts can be resolved finitely. This is the logic*

*>>> behind Tomita and other generalised LR parsing algorithms. The basic*

*>>> principle is that conflicts are resolved by checking all possibilities*

*>>> - the Tomita approach being to check them concurrently using stack*

*>>> duplication, whereas another common approach is to use backtracking.*

*>>*

*>>This does, however, not give you any k, since the amount you have to*

*>>read ahead before backtracking may depend on the input.*

*>*

*> My claim is that almost all conflicts can be resolved finitely.*

If you have an ambiguous grammar, this is certainly not true. But if

the grammar is unambiguous, there is by definition only one possible

parse tree for a string, and since there exist algorithms for finding

this, you can say that you resolve the conflict by finite lookahead.

But, again, this lookahead is input-dependent.

*> I never claimed that this tells me the value of k. A method for*

*> determining k was vaguely described later in the post.*

Vaguely indeed, as in wrong.

*> Of course formally I'm wrong, because I'm counting in terms of what is*

*> useful rather than the far more numerous grammars that aren't.*

How do you determine if a grammar is useful?

*> Anyway, the main limitation on finite resolution is that the input*

*> itself must be finite. That's no cheat since any input that can be*

*> parsed completely must be finite. Obviously, formally, its a*

*> limitation - but if you're doing state-so-far parsing of infinite*

*> input streams, large lookaheads (often any lookaheads) are*

*> inappropriate anyway, so it didn't seem relevant.*

*>*

*> I probably should have mentioned this, of course.*

I never talked about infinite input, so this is irrelevant.

*>>> I have designed an variant of GLR (though I have never implemented it)*

*>>> which should theoretically give constant-time parsing of any context*

*>>> free grammar*

*>>*

*>>I assume you mean "linear-time", as constant-time parsing sounds*

*>>implausible.*

*>*

*> Yes - sorry - I meant constant time per token of input. For some*

*> variants (depending on how the table is managed and evaluated) that's*

*> amortised constant time per token of input.*

*>*

*>> But even linear-time parsing for any context-free*

*>>grammar sounds suspect and would be a very big result (if you can*

*>>prove it).*

*>*

*> 1. As I said, there's a pretty big proviso in there, related to the*

*> halting problem. The only improvement in the power of this*

*> relative to backtracking/Tomita is that it can detect problems for*

*> specific inputs at run-time.*

*>*

*> Actually, there is a proviso to that proviso. My method doesn't*

*> need reduce sizes to be encoded in the state model - they are*

*> determined at run-time by the table. This makes it easy to support*

*> EBNF definitions for nonterminals. There's no extra power in terms*

*> of the set-of-strings definition of a grammar, but it's a little*

*> more flexible in how you get to define the grammar.*

I just don't understand this description. Can you write it as

pseudo-code?

*> 2. As I said, at best I re-invented it ten years after the fact.*

*> Formal papers have already been published for tabular LR parsing.*

Indeed. But while they claim linear-time parsing for tables without

conflicts, there is no claim for tables with conflicts.

*> I also suggest some tricks such as a kind of sliding window for*

*> the table for handling unbounded inputs, but that leads to further*

*> provisos. Truth told, even ignoring the cycle issues, you can*

*> only ensure linear time parsing of any CF grammar using my method*

*> if you know the full input up-front, so that you can use a*

*> preallocated table.*

Knowing the full input up-front is no problem, but if you make a table

specifically to this input, you must show that this can be constructed

in linear time and that subsequent parsing using this table is also in

linear time.

*> That said, I believe that a proof that general CF parsing is*

*> linear time was described here some years back. I started a thread*

*> in 2002 relating to linear time parsing (my idea back then was*

*> broken), and one of the replies referred to Kolomogorov complexity*

*> theory. 6 years later, and I still haven't done the reading to*

*> understand that.*

I believe linear-time CF parsing is still an open problem.

*> 3. A lot of the costs have big constant factors. The cycle checks for*

*> a start. *Very* unlikely to be mainstream. Potentially useful for*

*> niche applications, I suppose, but even then I have my doubts.*

For just proving linear-time, big constant factors don't matter - as

long as they are constants independent of the input size.

*> My notes are in a 12-page approx. 340K OpenDocument Text file. The*

*> real explanation of the method takes about 3 pages, the rest is on*

*> variations and other general notes.*

If you really have an algorithm for linear-time CF parsing, send a

paper to a journal to get it reviewed. If you get it past the

reviewers, I might be persuaded to read your article. There are just

too many "I have this wonderful solution to (open problem), which I

have described in (long and impenetrable text)" claims out there for

me to want to read one more.

Torben

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.