Re: parser performance, was Popularity of compiler tools, was LRgen



From: Ian Lance Taylor <ian@airs.com>
Newsgroups: comp.compilers
Date: 12 Apr 2008 13:06:06 -0700
Organization: Compilers Central
References: 08-03-107 08-03-119 08-04-024 08-04-046 08-04-047
Keywords: parse, GCC
Posted-Date: 12 Apr 2008 16:08:04 EDT

Ian Lance Taylor <ian@airs.com> writes:


> > [My understanding is that GCC switched to a hand-written parser
> > because of the difficulty of parsing the awful C++ grammar with
> > anything other than hand-written hacks. The new parser may be a
> > little faster but that wasn't a big issue, since parse time is never a
> > bottleneck in a compiler. -John]
>
> I want to disagree with our esteemed moderator a little bit. Parsing
> time is not a bottleneck when optimizing. But the speed of the
> compiler matters more when not optimizing, and in that case the parser
> can indeed be a bottleneck. When compiling C++ with gcc with a lot of
> header files, the parsing time can be up to 50% of the total
> compilation time when not optimizing.
>
> Ian
> [Are you including tokenizing in the 50%? Lexers often do take a lot
> of time, since they have to do something to each character. But once
> the lexer has shrunk the input from a stream of characters to a stream
> of tokens, the parser rarely takes an appreciable amount of time.
> Opinions vary about the relative performance of DFA lexers vs ad-hoc
> hand written ones, which I think means that the implementation is more
> important than the technique. -John]


The 50% does include tokenizing, but parsing proper still accounts for
more than half of it.  A C++ source file can easily pull in hundreds of
thousands of lines of header files, and gcc parses all of them, even
though very little of that text is used by anything after the parsing
stage.
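
To get a feel for the scale involved, consider a trivial translation
unit of the kind at issue (my own illustration, not anything from gcc's
sources); the two standard headers alone expand to tens of thousands of
lines of declarations, every one of which the front end must tokenize
and parse:

    // Only a handful of the parsed declarations are ever used below,
    // yet the front end must parse the entire preprocessed text.
    #include <string>
    #include <vector>

    int main() {
      std::vector<std::string> v;  // touches a tiny fraction of what was parsed
      v.push_back("hello");
      return v.size() == 1 ? 0 : 1;
    }

You can measure this directly: g++ -E example.cc | wc -l shows the size
of the preprocessed input, and g++ -c -O0 -ftime-report example.cc
prints a breakdown of where the compilation time went, including the
parser's share.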


I should add that I agree with your main point: there probably isn't
much speed difference between a hand-written parser and a generated
one.  I just want to make the point that parsing speed is relevant in
practice.


gcc now uses a hand-written parser for C too, by the way.
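
For readers who haven't seen one, a hand-written parser of this kind is
ordinarily plain recursive descent: one function per grammar rule,
consuming the input directly.  Here is a minimal sketch for a toy
expression grammar (my own illustration; gcc's actual front ends are
vastly larger and parse real C and C++):

    // Recursive descent: one function per rule, written as ordinary code.
    // Grammar:  expr   -> term (('+' | '-') term)*
    //           term   -> factor (('*' | '/') factor)*
    //           factor -> NUMBER | '(' expr ')'
    #include <cctype>
    #include <cstdio>
    #include <cstdlib>

    static const char *p;  // cursor into the input being parsed

    static void skip() { while (std::isspace((unsigned char)*p)) p++; }

    static long expr();

    static long factor() {
      skip();
      if (*p == '(') {               // parenthesized subexpression
        p++;
        long v = expr();
        skip();
        if (*p++ != ')') { std::fprintf(stderr, "expected ')'\n"); std::exit(1); }
        return v;
      }
      if (!std::isdigit((unsigned char)*p)) {
        std::fprintf(stderr, "expected a number\n");
        std::exit(1);
      }
      long v = 0;                    // number literal: accumulate decimal digits
      while (std::isdigit((unsigned char)*p)) v = v * 10 + (*p++ - '0');
      return v;
    }

    static long term() {
      long v = factor();
      for (skip(); *p == '*' || *p == '/'; skip()) {
        char op = *p++;
        long r = factor();
        v = (op == '*') ? v * r : v / r;
      }
      return v;
    }

    static long expr() {
      long v = term();
      for (skip(); *p == '+' || *p == '-'; skip()) {
        char op = *p++;
        long r = term();
        v = (op == '+') ? v + r : v - r;
      }
      return v;
    }

    int main() {
      p = "2 * (3 + 4) - 5";
      std::printf("%ld\n", expr());  // prints 9
      return 0;
    }

The attraction for a grammar as messy as C++'s is that each of these
functions can apply ad-hoc lookahead, backtracking, or semantic checks
exactly where the grammar needs them, which is much harder to express
in a table-driven generated parser.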


Ian


