Re: LR(1) Parsing : Error Handling & Recovery

Ivan Godard <ivan@ootbcomp.com>
Thu, 17 Jul 2014 13:57:00 -0700

From comp.compilers

Related articles
LR(1) Parsing : Error Handling & Recovery seimarao@gmail.com (Seima Rao) (2014-07-10)
Re: LR(1) Parsing : Error Handling & Recovery ivan@ootbcomp.com (Ivan Godard) (2014-07-16)
Re: LR(1) Parsing : Error Handling & Recovery news@fx29.iad.highwinds-media.com (Eric) (2014-07-16)
Re: LR(1) Parsing : Error Handling & Recovery drikosev@otenet.gr (Evangelos Drikos) (2014-07-17)
Re: LR(1) Parsing : Error Handling & Recovery ivan@ootbcomp.com (Ivan Godard) (2014-07-17)
*Re: LR(1) Parsing : Error Handling & Recovery ivan@ootbcomp.com (Ivan Godard)* (2014-07-17)**
Re: LR(1) Parsing : Error Handling & Recovery gneuner2@comcast.net (George Neuner) (2014-07-17)
Re: LR(1) Parsing : Error Handling & Recovery wclodius@earthlink.net (2014-07-18)
Re: LR(1) Parsing : Error Handling & Recovery wclodius@earthlink.net (2014-07-18)
Re: LR(1) Parsing : Error Handling & Recovery monnier@iro.umontreal.ca (Stefan Monnier) (2014-07-18)
Re: LR(1) Parsing : Error Handling & Recovery DrDiettrich1@aol.com (Hans-Peter Diettrich) (2014-07-19)
Re: LR(1) Parsing : Error Handling & Recovery haberg-news@telia.com (Hans Aberg) (2014-07-19)
[19 later articles]

| List of all articles for this month |

From:	Ivan Godard <ivan@ootbcomp.com>
Newsgroups:	comp.compilers
Date:	Thu, 17 Jul 2014 13:57:00 -0700
Organization:	A noiseless patient Spider
References:	14-07-023 14-07-024 14-07-026 14-07-027
Keywords:	LR(1), parse, comment
Posted-Date:	18 Jul 2014 13:35:29 EDT

On 7/17/2014 1:03 PM, Ivan Godard wrote:

> [The problem with LALR error recovery is that the generated parser
> combines states that are the same if your program is valid, but they
> are from different grammar rules, so you'd want to produce different
> error messages. The size of LR(1) parser tables was an issue on a 64K
> PDP-11, but they're trivial now, and I don't know of any reason other
> than inertia to use LALR rather than LR(1). -John]
>

This suggests that LALR with sufficient error handling is of the same
size as LR(1), so I'll rephrase my question in those terms: what is
the ruleset size difference between good and no error handling for the
same language? I am concerned with ruleset size here, not table size.
Ruleset size matters to the language designer, which I have repeatedly
been in my career. Too large a ruleset makes a language intellectually
intractable and impractical to extend; one of the charges against
Algol68 was that it was too big, compared to Algol60; the WG2.1 group
of then should see the C++ spec we have now :-(

In my recursive descent compilers, I have always done semantic analysis
on the fly while doing the parse, which lets me give semantically apt
diagnostics where appropriate rather than "left parenthesis expected",
or, worse, the "right bracket cannot appear here" kind of diagnostic.

Doing semantics on the fly also permits a language to have a
semantic-directed grammar, where what can follow "foo" depends on what
"foo" is. Such grammars have much greater expressiveness than pure
BNF-style syntax, although that beauty may be in the eye of the beholder.

[Error rules in yacc parsers are really for error recovery, not error
reporting. They let the parser resynchronize at a boundary such as a
semicolon between statements. LALR can only tell you that the parse
failed, and it can give you a set of tokens that would be valid where
the parse failed, but that set is often too large due to the way the
states are created. It's certainly easier to produce diagnostics in a
RD parser. What's hard is to verify that the language it accepts is
the language you want, and that it correctly catches any invalid
input. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: LR(1) Parsing : Error Handling & Recovery

Ivan Godard <ivan@ootbcomp.com>Thu, 17 Jul 2014 13:57:00 -0700

Ivan Godard <ivan@ootbcomp.com>
Thu, 17 Jul 2014 13:57:00 -0700