|How test if language is LL(k) or LR(k) ? email@example.com (Borneq) (2008-10-23)|
|Re: How test if language is LL(k) or LR(k) ? cfc@shell01.TheWorld.com (Chris F Clark) (2008-10-28)|
|Re: How test if language is LL(k) or LR(k) ? firstname.lastname@example.org (Stephen Horne) (2008-10-29)|
|Re: How test if language is LL(k) or LR(k) ? email@example.com (Ralph Boland) (2008-10-31)|
|Re: How test if language is LL(k) or LR(k) ? cfc@shell01.TheWorld.com (Chris F Clark) (2008-11-06)|
|Re: How test if language is LL(k) or LR(k) ? firstname.lastname@example.org (2008-11-07)|
|Re: How test if language is LL(k) or LR(k) ? email@example.com (firstname.lastname@example.org) (2008-11-07)|
|Re: How test if language is LL(k) or LR(k) ? email@example.com (Stephen Horne) (2008-11-07)|
|Re: How test if language is LL(k) or LR(k) ? firstname.lastname@example.org (2008-11-10)|
|Re: How test if language is LL(k) or LR(k) ? cfc@shell01.TheWorld.com (Chris F Clark) (2008-11-10)|
|Re: How test if language is LL(k) or LR(k) ? email@example.com (Stephen Horne) (2008-11-12)|
|From:||Stephen Horne <firstname.lastname@example.org>|
|Date:||Fri, 07 Nov 2008 21:35:30 +0000|
|References:||08-10-044 08-10-055 08-11-033|
|Keywords:||parse, theory, comment|
|Posted-Date:||07 Nov 2008 18:58:40 EST|
On Fri, 07 Nov 2008 14:27:18 +0100, email@example.com (Torben AEgidius Mogensen) wrote:
>Stephen Horne <firstname.lastname@example.org> writes:
>> For LR, *almost* all context-free grammars are LR(k < infinity), so if
>> you can prove that the grammar is context-free you are almost there.
>That's not very useful, and not even true. There are many useful
>grammars that are not LR(k) for any k and even useful languages for
>which there doesn't exits LR(k) grammars for any k.
Context-free grammars? And for any finite k?
My understanding is that you'd need an infinite input. While there are
certainly applications where unbounded inputs occur, a genuinely
infinite input can never be completely parsed anyway.
I'm not saying I'm right here, as I'm certain you know more than me,
but I'd like to see an example.
>> You can always construct an LR(1) - or even LR(0) - state model to
>> parse a context-free grammar, but if the grammar is not LR(1) that
>> state model will have shift/reduce or reduce/reduce conflicts.
>> Almost all such conflicts can be resolved finitely. This is the logic
>> behind Tomita and other generalised LR parsing algorithms. The basic
>> principle is that conflicts are resolved by checking all possibilities
>> - the Tomita approach being to check them concurrently using stack
>> duplication, whereas another common approach is to use backtracking.
>This does, however, not give you any k, since the amount you have to
>read ahead before backtracking may depend on the input.
My claim is that almost all conflicts can be resolved finitely. I
never claimed that this tells me the value of k. A method for
determining k was vaguely described later in the post.
Of course formally I'm wrong, because I'm counting in terms of what is
useful rather than the far more numerous grammars that aren't.
Anyway, the main limitation on finite resolution is that the input
itself must be finite. That's no cheat since any input that can be
parsed completely must be finite. Obviously, formally, its a
limitation - but if you're doing state-so-far parsing of infinite
input streams, large lookaheads (often any lookaheads) are
inappropriate anyway, so it didn't seem relevant.
I probably should have mentioned this, of course.
>> I have designed an variant of GLR (though I have never implemented it)
>> which should theoretically give constant-time parsing of any context
>> free grammar
>I assume you mean "linear-time", as constant-time parsing sounds
Yes - sorry - I meant constant time per token of input. For some
variants (depending on how the table is managed and evaluated) that's
amortised constant time per token of input.
> But even linear-time parsing for any context-free
>grammar sounds suspect and would be a very big result (if you can
1. As I said, there's a pretty big proviso in there, related to the
halting problem. The only improvement in the power of this
relative to backtracking/Tomita is that it can detect problems for
specific inputs at run-time.
Actually, there is a proviso to that proviso. My method doesn't
need reduce sizes to be encoded in the state model - they are
determined at run-time by the table. This makes it easy to support
EBNF definitions for nonterminals. There's no extra power in terms
of the set-of-strings definition of a grammar, but it's a little
more flexible in how you get to define the grammar.
That's not all that interesting, though, as EBNF rules could
already be handled by grammar transformations.
Early is certainly still more powerful. Where backtracking LR goes
into an infinite loop and tabular LR can give a run time error, my
understanding is that Early would just work.
2. As I said, at best I re-invented it ten years after the fact.
Formal papers have already been published for tabular LR parsing.
I also suggest some tricks such as a kind of sliding window for
the table for handling unbounded inputs, but that leads to further
provisos. Truth told, even ignoring the cycle issues, you can
only ensure linear time parsing of any CF grammar using my method
if you know the full input up-front, so that you can use a
Actually, I'm not entirely confident that Nederhofs tabular LR is
the same as mine - I haven't read his papers properly, and can't
even get hold of all of them. But given what I have read, it seems
to be the same thing.
That said, I believe that a proof that general CF parsing is
linear time was described here some years back. I started a thread
in 2002 relating to linear time parsing (my idea back then was
broken), and one of the replies referred to Kolomogorov complexity
theory. 6 years later, and I still haven't done the reading to
Search for "Have I discovered something new?" and "Stephen Horne"
in Google Groups.
3. A lot of the costs have big constant factors. The cycle checks for
a start. *Very* unlikely to be mainstream. Potentially useful for
niche applications, I suppose, but even then I have my doubts.
My notes are in a 12-page approx. 340K OpenDocument Text file. The
real explanation of the method takes about 3 pages, the rest is on
variations and other general notes.
I can e-mail them if you want.
[Yes, there are lots of context free grammars that are not LR(k). See, for
example http://compilers.iecc.com/comparch/article/99-10-178 when this
came up nine years ago. -John]
Return to the
Search the comp.compilers archives again.