Re: Yacc/Bison - what semantic actions to take on a parse error

Chris F Clark
Wed, 30 May 2012 14:41:51 -0400

          From comp.compilers

Related articles
Yacc/Bison - what semantic actions to take on a parse error (James Harris) (2012-05-23)
Re: Yacc/Bison - what semantic actions to take on a parse error (James Harris) (2012-05-24)
Re: Yacc/Bison - what semantic actions to take on a parse error (James Harris) (2012-05-24)
Re: Yacc/Bison - what semantic actions to take on a parse error (Chris F Clark) (2012-05-30)

From: Chris F Clark
Newsgroups: comp.compilers
Date: Wed, 30 May 2012 14:41:51 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 12-05-014 12-05-021 12-05-022
Keywords: yacc, parse, errors, comment
Posted-Date: 30 May 2012 14:58:30 EDT

On May 24, 8:05 pm, James Harris wrote:

> * Create an X node with dummy values. That would satisfy the type
> checking.

Around 1987 some folks at Unisys faced the same problem. Their
solution was "plastic error nodes", nodes that represent errors but
which can convert to any other type (be used in any other context).
The error nodes could also carry the information about what the error
was, e.g. what correct things were found, and what seemed to be
missing. That seems to be what you are trying to recreate.
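To make the idea concrete, here is a minimal C++ sketch of what such a node might look like. I have not seen the Unisys code, so every name below (ErrorInfo, Node, make_error) is hypothetical; the only point is that one error record can be molded into whatever node type the context wants while still carrying what was found and what seemed to be missing.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of what went wrong at the error site.
struct ErrorInfo {
    std::string found;                  // what the parser actually saw
    std::vector<std::string> expected;  // what seemed to be missing
};

struct Node {
    virtual ~Node() = default;
    // Non-null only when this node stands in for an erroneous phrase.
    std::shared_ptr<const ErrorInfo> error;
    bool is_error() const { return error != nullptr; }
};

struct Expr : Node { int value = 0; };
struct Stmt : Node { std::string text; };

// "Plastic": the same error record can take the shape of any node type T.
template <typename T>
std::unique_ptr<T> make_error(std::shared_ptr<const ErrorInfo> info) {
    assert(info != nullptr);
    auto node = std::make_unique<T>();
    node->error = std::move(info);
    return node;
}
```

With this, make_error<Expr>(info) and make_error<Stmt>(info) both type-check in their respective contexts, and later passes can test is_error() instead of inspecting dummy values.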

The issue you seem to have is type checking of the node types.

The moderator's solution of using null pointers solves that problem.
As long as you use pointers (rather than references) in C++, a null
value will convert to any pointer type. Of course, then you have the
issue of checking for nulls at every dereference site, which may be a
good idea anyway. You also have the issue that a null pointer cannot
carry any other information, i.e. you lose the ability to carry any
useful information you could gather at the error site.
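A small C++ sketch of that trade-off, with hypothetical types and function names of my own choosing: the null value fits every pointer-typed slot, but each consumer must test for it and learns nothing about the error.

```cpp
#include <cassert>
#include <string>

struct Expr { int value; };
struct Stmt { std::string text; };

// nullptr converts to any pointer type, so one "error value" fits every slot.
Expr* parse_expr_failed() { return nullptr; }
Stmt* parse_stmt_failed() { return nullptr; }

// Every dereference site has to check first.
int eval_or_zero(const Expr* e) {
    if (e == nullptr) return 0;  // error upstream; nothing more is known
    return e->value;
}

// A consumer that assumes errors were filtered out earlier can at least
// state that assumption.
int eval(const Expr* e) {
    assert(e != nullptr);
    return e->value;
}
```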

Alternately, you could follow the Java model of having an "object"
type that everything can convert to/from. That's actually the default
model in yacc (i.e. if you don't try typing your nodes in the
grammar). The parse stack in an LR parser by default needs to be
untyped because it needs to hold different parts of different phrases
being parsed at different times.
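In C++ terms the "object" model might be sketched roughly as follows. This is only an illustration under my own assumptions (a common base class named Object and a checked downcast helper named as); it is not how yacc itself stores values.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Object { virtual ~Object() = default; };  // root of every node type
struct Expr : Object { int value = 0; };
struct Stmt : Object { std::string text; };
struct ErrorNode : Object { std::string message; };

// One stack holds whatever phrase fragments are currently being parsed.
using ValueStack = std::vector<std::unique_ptr<Object>>;

// Converting back is a checked downcast; nullptr means "some other type".
template <typename T>
T* as(const std::unique_ptr<Object>& slot) {
    return dynamic_cast<T*>(slot.get());
}

// Pop the top value, whatever its dynamic type is.
std::unique_ptr<Object> pop(ValueStack& stack) {
    assert(!stack.empty());
    auto top = std::move(stack.back());
    stack.pop_back();
    return top;
}
```

An error node then fits any slot, at the cost of moving the type check from compile time to each as<T>() call.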

The last solution is to have a different node created at each error
location. The type of node should be a special error node whose type
is derived from the type on the left-hand-side of the
production. Something like the following snippet:

%type <X_type> X
%%
X   : "a normal X" ';' { $$ = Xnode("specific data"); }
    | error ';'        { class XerrorNode : public Xnode {}; $$ = XerrorNode(); }
    ;
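On the C++ side, the classes behind such actions might look like the sketch below. It assumes X_type is a pointer to Xnode (the snippet does not say), and the is_error() hook, the message field, and the on_normal_X/on_error_X stand-ins for the two actions are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct Xnode {
    explicit Xnode(std::string d) : data(std::move(d)) {}
    virtual ~Xnode() = default;
    virtual bool is_error() const { return false; }
    std::string data;
};

// Derived from the left-hand-side type, so it is accepted wherever an X is.
struct XerrorNode : Xnode {
    explicit XerrorNode(std::string what)
        : Xnode("<error>"), message(std::move(what)) {}
    bool is_error() const override { return true; }
    std::string message;  // whatever could be gathered at the error site
};

// Stand-ins for the two semantic actions.
std::unique_ptr<Xnode> on_normal_X() {
    return std::make_unique<Xnode>("specific data");
}
std::unique_ptr<Xnode> on_error_X() {
    return std::make_unique<XerrorNode>("malformed X before ';'");
}

// A later pass can treat both uniformly and still recover the error details.
std::string describe(const Xnode& x) {
    if (!x.is_error()) return x.data;
    const auto* err = dynamic_cast<const XerrorNode*>(&x);
    assert(err != nullptr);  // only XerrorNode overrides is_error() here
    return "error: " + err->message;
}
```

Unlike a null pointer, the error node passes the type check for X and keeps its diagnostic information; each nonterminal type needs its own derived error class, though.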

As the moderator has mentioned, when you get to an error production,
yacc (or bison) has already started mangling the stack, discarding
things that were pushed on before but were part of the erroneous
(non-sentential) phrase that the error is synchronizing past. None of
this is going to help with that.

Finally, error productions are a last-ditch effort at error
processing. They are there so that you can continue parsing at a
relatively sane place, but they are a blunt instrument. They
don't help you create good error messages for simple and important
cases, and they can easily lead you into spurious cascading errors.

Hope this helps,

Chris Clark email:
Compiler Resources, Inc. Web Site:
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
[I wouldn't say that LR parse stacks need to be untyped, although they do need
to hold different types. Yacc and bison parse stacks are generally an array
of C unions. -John]
