Re: what parser generator?

"Ira D. Baxter" <>
19 Dec 2000 17:01:34 -0500

          From comp.compilers


From: "Ira D. Baxter" <>
Newsgroups: comp.compilers
Date: 19 Dec 2000 17:01:34 -0500
Organization: Posted via Supernews,
References: 00-12-079 00-12-083
Keywords: parse
Posted-Date: 19 Dec 2000 17:01:34 EST

"Hans-Bernhard Broeker" <> wrote in message
> Paul Drummond <> wrote:
> > I am writing a C++ DocTool for my 3yr uni project and I have been
> > looking at different generators.

> from my own experience with a C-analysing program, I can say that it's
> probably your best bet to explicitly reflect the preprocessing phase
> of C-like languages in your analyser. I.e. first preprocess to isolate
> the comments from the rest of the text (keeping pointers into the
> preprocessed text as links between them), and then lex/parse the
> remaining text to understand the structure.

And this of course means you have to write a comment-retaining
preprocessor. (After all, your conventional preprocessor chucks the
comments). Yuk! We think it is better to parse the unpreprocessed
text directly.

> Writing a somewhat complete grammar for
> 'C++-with-all-comments-still-in' is quite a bit more tedious than
> one for 'C++-after-preprocessing'. You'll roughly double the number
> of rules in the grammar...

C++ is hard enough without doing this, and the poor guy's a university
student. The moderator has the right answer, below; we do that and it
works pretty well.

> > The alternative is to write my own parser. I don't think it would be
> > IMPOSSIBLE because I never enter function bodies, so I don't need to
> > look for expressions, loops or anything.
> You do have to, at least partly. It's the only reliable way of
> finding the _end_ of a function. You have to at least count
> braces. Not even to mention the occasional #ifdef section...

The really bad part is that people write stuff like:
                #define BEGIN {
So now you can't count matching braces without looking inside the macros.
When you get conditional macro definitions, life gets really hard.
This is what pushes you towards doing preprocessing; it's a way
of ducking macro-body analysis. We are working on the latter.
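
To make the BEGIN problem concrete, here is a minimal Python sketch of a
brace counter over unpreprocessed text. It is an illustration only, not
anyone's actual tool: the names net_brace_depth and DEFINE are made up for
this example, only object-like macros of the form "#define NAME body" are
handled, and string literals, comments, and #ifdef sections are ignored.

```python
import re

# Hypothetical helper: matches object-like macros such as "#define BEGIN {".
# Function-like macros ("#define F(x) ...") deliberately do not match.
DEFINE = re.compile(r"^\s*#\s*define\s+(\w+)\s+(.*)$")


def net_brace_depth(source, expand_macros=False):
    """Return (number of '{') - (number of '}') outside directive lines.

    With expand_macros=True, identifiers that name a previously seen
    object-like macro are replaced by the macro body before counting.
    """
    macros = {}
    depth = 0
    for line in source.splitlines():
        m = DEFINE.match(line)
        if m:
            macros[m.group(1)] = m.group(2)
            continue
        if line.lstrip().startswith("#"):
            continue  # any other directive is treated as opaque
        if expand_macros:
            line = re.sub(
                r"\w+",
                lambda t: macros.get(t.group(0), t.group(0)),
                line,
            )
        depth += line.count("{") - line.count("}")
    return depth


MACRO_SOURCE = "#define BEGIN {\nvoid f() BEGIN g(); }\n"

naive = net_brace_depth(MACRO_SOURCE)
expanded = net_brace_depth(MACRO_SOURCE, expand_macros=True)
```

Without looking inside the macro, the counter sees one closing brace and
no opening brace in f; with the macro body substituted, the braces
balance. A conditional definition of BEGIN would defeat even this, since
the sketch simply keeps whichever #define it saw last.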

> Hans-Bernhard Broeker (
> [Rather than putting the comments in the grammar, I'd fake it in the
> lexer and hang the comment text on the preceding or following token.
> That's not perfect, but it's not much less perfect than a lot of
> more complicated schemes. -John]
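
The lexer trick described in the note above might look roughly like the
following Python sketch. It is an assumption-laden illustration, not the
moderator's or Semantic Designs' code: Token, TOKEN, and lex are
hypothetical names, comments are hung on the following token (falling back
to the last token for trailing comments), and string literals and
preprocessor directives are not treated specially.

```python
import re
from dataclasses import dataclass, field

# Simplified token pattern: comments, whitespace, identifiers, and
# single characters for everything else.
TOKEN = re.compile(
    r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<space>\s+)
  | (?P<word>[A-Za-z_]\w*)
  | (?P<other>.)
    """,
    re.VERBOSE | re.DOTALL,
)


@dataclass
class Token:
    text: str
    comments: list = field(default_factory=list)  # comments seen just before


def lex(source):
    """Return tokens, each carrying the comments that preceded it."""
    tokens = []
    pending = []  # comments waiting for the next real token
    for m in TOKEN.finditer(source):
        kind = m.lastgroup
        if kind == "comment":
            pending.append(m.group())
        elif kind != "space":
            tokens.append(Token(m.group(), pending))
            pending = []
    if pending and tokens:
        # Trailing comments have no following token; hang them on the last one.
        tokens[-1].comments.extend(pending)
    return tokens


tokens = lex("/* Adds two ints. */\nint add(int a, int b);\n// trailing\n")
```

The grammar never sees a comment token, so its rules stay the same size;
a documentation tool can read the comments list of the first token of each
declaration once the parser has found it.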

Ira D. Baxter, Ph.D., CTO email:
Semantic Designs, Inc. web:
12636 Research Blvd. C-214 voice: (512) 250-1018 x140
Austin, TX 78759-2200 fax: (512) 250-1191
