Re: what parser generator?

"Ira D. Baxter" <idbaxter@semdesigns.com>
19 Dec 2000 17:01:34 -0500

          From comp.compilers


From: "Ira D. Baxter" <idbaxter@semdesigns.com>
Newsgroups: comp.compilers
Date: 19 Dec 2000 17:01:34 -0500
Organization: Posted via Supernews, http://www.supernews.com
References: 00-12-079 00-12-083
Keywords: parse
Posted-Date: 19 Dec 2000 17:01:34 EST

"Hans-Bernhard Broeker" <broeker@physik.rwth-aachen.de> wrote in message
> Paul Drummond <Drum.Sefex@btinternet.com> wrote:
> > I am writing a C++ DocTool for my 3yr uni project and I have been
> > looking at different generators.


> From my own experience with a C-analysing program, I can say that it's
> probably your best bet to explicitly reflect the preprocessing phase
> of C-like languages in your analyser. I.e. first preprocess to isolate
> the comments from the rest of the text (keeping pointers into the
> preprocessed text as links between them), and then lex/parse the
> remaining text to understand the structure.


And this of course means you have to write a comment-retaining
preprocessor. (After all, a conventional preprocessor chucks the
comments.) Yuk! We think it is better to parse the unpreprocessed
text directly.
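
For concreteness, the comment-isolating pass described in the quoted text might start out like the sketch below. This is only an illustration under loud assumptions: the names `ExtractedComment`, `SplitResult`, and `splitComments` are made up here, string and character literals are not handled, and no macro expansion or #include processing is attempted, which is where the real work of a comment-retaining preprocessor lies.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A comment lifted out of the source, plus the offset in the stripped
// text where it used to sit (the "pointer" linking the two).
struct ExtractedComment {
    std::size_t offsetInCode;
    std::string text;
};

struct SplitResult {
    std::string code;                        // source with comments removed
    std::vector<ExtractedComment> comments;  // comments in source order
};

// Assumption: comment markers never occur inside string or character
// literals. Each comment is replaced by one space so that the tokens on
// either side of it do not merge.
SplitResult splitComments(const std::string& src) {
    SplitResult out;
    std::size_t i = 0;
    while (i < src.size()) {
        std::size_t end = std::string::npos;  // npos means "not a comment"
        if (src.compare(i, 2, "//") == 0) {
            end = src.find('\n', i);
            if (end == std::string::npos) end = src.size();
        } else if (src.compare(i, 2, "/*") == 0) {
            end = src.find("*/", i + 2);
            end = (end == std::string::npos) ? src.size() : end + 2;
        }
        if (end == std::string::npos) {
            out.code += src[i++];
        } else {
            out.comments.push_back({out.code.size(), src.substr(i, end - i)});
            out.code += ' ';
            i = end;
        }
    }
    return out;
}
```

Calling `splitComments("int a; /* count */ int b; // size\n")` yields two comments, recorded at offsets 7 and 16 of the stripped text. Everything a real preprocessor does on top of this is left out.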


> Writing a somewhat complete grammar for
> 'C++-with-all-comments-still-in' is quite a bit more tedious than
> one for 'C++-after-preprocessing'. You'll roughly double the number
> of rules in the grammar...


C++ is hard enough without doing this, and the poor guy's a university
student. The moderator has the right answer, below; we do that and it
works pretty well.


> > The alternative is to write my own parser. I don't think it would be
> > IMPOSSIBLE because I never enter function bodies, so I don't need to
> > look for expressions, loops or anything.
>
> You do have to, at least partly. It's the only reliable way of
> finding the _end_ of a function. You have to at least count
> braces. Not even to mention the occasional #ifdef section...


The really bad part is that people write stuff like:
        #define BEGIN {
So now you can't count matching braces without looking inside the macros.
When you get conditional macro definitions, life gets really hard.
This is what pushes you towards doing preprocessing; it's a way
of ducking macro-body analysis. We are working on the latter.
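
A small sketch of why raw brace counting breaks on input like this. The function name `braceDepthPerLine`, the sample text `kMacroSource`, and the simplification that no braces hide inside strings, character literals, or comments are all assumptions made for illustration; this is not anybody's actual tool.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns the running brace depth after each line of raw, unpreprocessed
// source text. If skipDirectives is true, lines whose first non-blank
// character is '#' are not scanned for braces.
// Assumption: braces never appear in strings, character literals, or
// comments; a real scanner would have to skip those.
std::vector<int> braceDepthPerLine(const std::string& text, bool skipDirectives) {
    std::vector<int> depths;
    std::istringstream in(text);
    std::string line;
    int depth = 0;
    while (std::getline(in, line)) {
        const std::size_t first = line.find_first_not_of(" \t");
        const bool isDirective =
            first != std::string::npos && line[first] == '#';
        if (!(skipDirectives && isDirective)) {
            for (char c : line) {
                if (c == '{') ++depth;
                else if (c == '}') --depth;
            }
        }
        depths.push_back(depth);
    }
    return depths;
}

// Hypothetical input in the style of the macro above.
const std::string kMacroSource =
    "#define BEGIN {\n"
    "void f() BEGIN\n"
    "    g();\n"
    "}\n";
```

Counting every brace gives depths 1, 1, 1, 0 for the four lines, so the counter believes a block opened on the #define line. Skipping directive lines gives 0, 0, 0, -1, so the closing brace of f() has nothing to match. Neither answer tells you where the function body starts unless you look inside the macro.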


> Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
> [Rather than putting the comments in the grammar, I'd fake it in the
> lexer and hang the comment text on the preceding or following token.
> That's not perfect, but it's not much less perfect than a lot of
> more complicated schemes. -John]
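
The lexer trick in the moderator's note can be sketched roughly as follows. This is a minimal illustration, not the moderator's code or ours: the names `Token` and `lexWithComments` are hypothetical, the token rules are deliberately crude (words and single characters only), and string literals and preprocessor directives are not handled. It attaches each comment to the following token; comments after the last token are simply dropped.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A token plus any comments that appeared between it and the previous token.
struct Token {
    std::string text;
    std::vector<std::string> leadingComments;
};

// Comments never become tokens, so the grammar never sees them. They are
// queued in `pending` and hung on the next token the lexer produces.
std::vector<Token> lexWithComments(const std::string& src) {
    std::vector<Token> tokens;
    std::vector<std::string> pending;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (src.compare(i, 2, "//") == 0) {
            std::size_t end = src.find('\n', i);
            if (end == std::string::npos) end = src.size();
            pending.push_back(src.substr(i, end - i));
            i = end;
            continue;
        }
        if (src.compare(i, 2, "/*") == 0) {
            std::size_t end = src.find("*/", i + 2);
            end = (end == std::string::npos) ? src.size() : end + 2;
            pending.push_back(src.substr(i, end - i));
            i = end;
            continue;
        }
        const std::size_t start = i;
        if (std::isalnum(c) || c == '_') {
            // Word token: identifier, keyword, or number.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_')) {
                ++i;
            }
        } else {
            ++i;  // Any other character is a one-character token.
        }
        tokens.push_back(Token{src.substr(start, i - start), std::move(pending)});
        pending.clear();  // Leave the moved-from vector in a known empty state.
    }
    return tokens;
}
```

For the input `/* Adds two ints. */` followed by `int add(int a, int b);`, the first token is `int` and carries the comment. A documentation tool would then look for comments on the first token of each declaration. A trailing `//` comment at the end of a line ends up on the first token of the next line, which is the kind of imperfection the note concedes.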


Ira D. Baxter, Ph.D., CTO email: idbaxter@semdesigns.com
Semantic Designs, Inc. web: http://www.semdesigns.com
12636 Research Blvd. C-214 voice: (512) 250-1018 x140
Austin, TX 78759-2200 fax: (512) 250-1191

