Re: C++ Grammar - Update

Michael Spencer <>
7 May 2001 23:17:41 -0400

          From comp.compilers

From: Michael Spencer <>
Newsgroups: comp.compilers,
Date: 7 May 2001 23:17:41 -0400
Organization: Compilers Central
References: 01-04-141
Keywords: C++, parse
Posted-Date: 07 May 2001 23:17:41 EDT

Mike Dimmick wrote:
> The major reported problem with the C++ syntax is that it requires
> semantic information to parse correctly. This isn't strictly true,
> one can follow the technique of Ed Willink
> (
> who uses purely syntactic methods. However, this technique needs a
> lot of subsequent analysis to correct the misparsed syntax trees, and
> requires backtracking, and also needs some complicated binary-search
> methods to resolve template usage into a consistent AST. [2] I
> therefore decided to stick with the classic method of producing a
> symbol table "on the fly" as it were.

I believe some C++ constructs require semantic lookup. Take, for
example (in namespace scope),

    A B (C);

What is this?
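
To make the ambiguity concrete, here is a small sketch in standard C++. The same token sequence is a function declaration when C names a type, and a variable definition with a direct initializer when C names an object. The namespaces, the struct bodies, and the two helper functions are hypothetical and exist only so that both readings can live in one file.

```cpp
namespace as_function {
    struct A { int value; };
    struct C {};

    // C names a type here, so this line declares a function B
    // that takes a C and returns an A.
    A B (C);

    // Definition of the function declared above.
    A B (C) { return A{1}; }
}

namespace as_variable {
    struct A {
        int value;
        explicit A (int v) : value (v) {}
    };
    int C = 42;

    // C names an object here, so the identical tokens define a
    // variable B of type A, direct-initialized from C.
    A B (C);
}

// Hypothetical helpers that use each reading of "A B (C);".
int function_case () { return as_function::B (as_function::C{}).value; }
int variable_case () { return as_variable::B.value; }
```

A parser that sees only the tokens, without knowing what C was declared as, has no syntactic basis for choosing between the two.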

I recently ran into this problem while creating a tool called lzz.
lzz is a C++ preparser: it generates a header and source file from a
file containing a sequence of C++ declarations. lzz can significantly
reduce coding time. The tool is still in its beta stage, although it
is being used successfully on a project. For more information see

I'm using a backtracking LR(1) parser without a symbol table.
Luckily, I do not have to parse function bodies and expressions--the
(context dependent) lexer passes this code to the parser as single
tokens. Outside of function bodies and expressions, lzz has only four
restrictions:

  o parameters must be named in function types

  o parameters must be named in the catch clause of function try blocks

  o parameters must be named in template declarations

  o "_dinit" must precede direct initializers
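
As a rough illustration of the first three restrictions, the sketch below is ordinary C++ in which every parameter is named, even where the language would allow the name to be left out. It is only one reading of the list above, not lzz input taken from the tool's documentation, and all identifiers are hypothetical. The "_dinit" form is not shown because its exact syntax is not given here.

```cpp
#include <stdexcept>

// Function type: the parameter is named ("x"), although C++ would
// also accept "int (*) (int)".
typedef int (*unary_fn) (int x);

// Template declaration: the template parameter ("T") and the
// parameters of the function type ("x") are all named.
template <typename T>
T apply_twice (T (*f) (T x), T value)
{
    return f (f (value));
}

int increment (int x) { return x + 1; }

// Calls increment through the typedef'd function pointer type.
int call_through_typedef (int v)
{
    unary_fn f = increment;
    return f (v);
}

// Function try block: the catch clause names its parameter ("e"),
// although C++ would also accept an unnamed one.
int checked_half (int n)
try
{
    if (n % 2 != 0) throw std::invalid_argument ("odd");
    return n / 2;
}
catch (std::invalid_argument const & e)
{
    (void) e;   // named but unused in this sketch
    return -1;
}
```

With every parameter named, a parenthesized list after a declarator can be recognized as a parameter list from its shape alone, which is presumably what lets a parser without a symbol table proceed.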

Only the last restriction isn't C++ code. Alternatively, I could have
required expressions in a direct initializer to be enclosed within a
redundant set of parentheses (but I find typing "_dinit" is often

