Re: Best language for implementing compilers?

Christopher F Clark <christopher.f.clark@compiler-resources.com>
Mon, 11 Mar 2019 10:06:28 -0700 (PDT)

          From comp.compilers


From: Christopher F Clark <christopher.f.clark@compiler-resources.com>
Newsgroups: comp.compilers
Date: Mon, 11 Mar 2019 10:06:28 -0700 (PDT)
Organization: Compilers Central
References: 19-02-002 19-02-004 19-02-006 19-03-009 19-03-011
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="4421"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, C, comment
Posted-Date: 12 Mar 2019 00:39:18 EDT
In-Reply-To: 19-03-011

On Sunday, March 10, 2019 at 9:08:12 PM UTC-4, Hans-Peter Diettrich wrote:
> Am 10.03.2019 um 12:13 schrieb Christopher F Clark:
>
> > All that said, the output of any decent
> > C/C++ lexer and parser generator is often more than fast enough. That's
> > despite lexing and parsing often taking up to a third of the compilation
> > time.
> > BTW, lexing (because it looks at every character) is the dominant factor
> > in that.
>
> If so, then lexing were the dominant factor for *all* parsers with a
> lexer, not only for C/C++.
>
> My experience and the existence of specific workarounds identify two C
> specific properties as most time consuming, the preprocessor and the
> lack of a multi-module (project) compilation. None of these is related
> to a *lexer in the strict sense* (tokenizer), because it all happens in
> between the tokenizer and parser.
>
> DoDi


Sorry for the confusion. By "C/C++ lexer and parser", I meant ones written in
those languages, not ones compiling those languages. To be more precise, I
mean ones where a program has generated the lexing and parsing tables, and the
code that interprets those tables is in C/C++. You can get pretty close to the
optimal code sequence using C code. I suspect that with JIT-compiled Java code
or similar you can get comparable performance and code sequences.
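
To make "code that interprets those tables" concrete, here is a minimal
sketch of a table-driven scanner in C. The token kinds, states, character
classes and the next_token function are all made up for illustration; a real
generator would emit far larger tables (typically a 256-entry character class
table instead of the char_class function used here).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds, DFA states and character classes for a toy
   language with identifiers, integers and one-character punctuation. */
enum { T_EOF, T_IDENT, T_INT, T_PUNCT };
enum { S_START, S_IDENT, S_INT, S_PUNCT, S_COUNT, S_STOP = -1 };
enum { C_LETTER, C_DIGIT, C_SPACE, C_OTHER, C_COUNT };

/* Map a character to its class (a generator would emit a lookup table). */
static int char_class(unsigned char c) {
    if (isalpha(c) || c == '_') return C_LETTER;
    if (isdigit(c)) return C_DIGIT;
    if (isspace(c)) return C_SPACE;
    return C_OTHER;
}

/* Transition table: next_state[state][class]; S_STOP ends the token. */
static const signed char next_state[S_COUNT][C_COUNT] = {
    /*            LETTER   DIGIT    SPACE   OTHER   */
    /* START */ { S_IDENT, S_INT,   S_STOP, S_PUNCT },
    /* IDENT */ { S_IDENT, S_IDENT, S_STOP, S_STOP  },
    /* INT   */ { S_STOP,  S_INT,   S_STOP, S_STOP  },
    /* PUNCT */ { S_STOP,  S_STOP,  S_STOP, S_STOP  },
};

/* Token kind reported when the scan stops in a given state. */
static const int accept_kind[S_COUNT] = { T_EOF, T_IDENT, T_INT, T_PUNCT };

/* Scan one token starting at *p. Stores the token's start and length,
   advances *p past it, and returns the token kind. */
static int next_token(const char **p, const char **start, size_t *len) {
    const char *s = *p;
    int state = S_START;

    while (isspace((unsigned char)*s)) s++;   /* skip leading whitespace */
    *start = s;
    while (*s != '\0') {                      /* the table interpreter */
        int n = next_state[state][char_class((unsigned char)*s)];
        if (n == S_STOP) break;
        state = n;
        s++;
    }
    *len = (size_t)(s - *start);
    *p = s;
    return accept_kind[state];
}
```

The inner loop does one class lookup and one table lookup per character,
which is the sense in which the lexer "looks at every character".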


Moreover, I was talking about the raw lexing and parsing speed of reading
source code. If you don't "read" the source code at all, e.g. with precompiled
headers, you can go much faster. However, in that case, you probably still
have to "read" the predigested code from a file, just without lexing or
parsing it.
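
As a rough illustration of "read but not lex", the sketch below stores
already-scanned tokens as fixed-size records and loads them back with a plain
copy. The cached_token layout and the save_tokens/load_tokens names are
assumptions made for this example, not the format of any actual compiler's
precompiled headers, and the byte layout is only valid on the machine that
wrote it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one already-lexed token. */
struct cached_token {
    int kind;     /* token kind as the parser expects it */
    int offset;   /* position of the spelling in the original source */
    int length;   /* length of the spelling */
};

/* Write a token count followed by the raw records into buf.
   The caller must supply a large enough buffer. Returns bytes used. */
static size_t save_tokens(unsigned char *buf,
                          const struct cached_token *toks, int count) {
    memcpy(buf, &count, sizeof count);
    memcpy(buf + sizeof count, toks, (size_t)count * sizeof *toks);
    return sizeof count + (size_t)count * sizeof *toks;
}

/* Load records back without looking at any source characters.
   Returns the token count, or -1 if the data is truncated or too large. */
static int load_tokens(const unsigned char *buf, size_t size,
                       struct cached_token *out, int max) {
    int count;

    if (size < sizeof count) return -1;
    memcpy(&count, buf, sizeof count);
    if (count < 0 || count > max) return -1;
    if (size - sizeof count < (size_t)count * sizeof *out) return -1;
    memcpy(out, buf + sizeof count, (size_t)count * sizeof *out);
    return count;
}
```

Loading costs one bulk copy per file rather than a table lookup per
character, which is where the speedup would come from.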


Your point about the C preprocessor is also valid. Depending upon the
implementation, you might have to lex the code twice: once to get
preprocessor tokens and again to get the kind of tokens the parser consumes.
Of course, a clever design can probably find ways to use mostly the same
tokens for both purposes and skip the second tokenizing pass except when
features like token pasting are used. On the other hand, a naive
implementation might actually write out the preprocessed file and re-read it.
As you can imagine, the performance difference between the two
implementations will likely be noticeable.
[I've seen systems that cache tokenized header files which should help. It's a
little tricky due to token pasting, but it's not that hard. -John]
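
One way the "same tokens for both purposes" idea might look is sketched
below: each preprocessor token keeps its kind, and only tokens produced by
pasting are flagged for re-classification before the parser sees them. The
pp_token struct, the pasted flag and the three-kind classification are
simplifications invented for this example; a real preprocessor has many more
token kinds and must also diagnose pastes that do not form a valid token.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

enum pp_kind { PP_IDENT, PP_NUMBER, PP_PUNCT };

struct pp_token {
    enum pp_kind kind;
    char text[32];   /* spelling; fixed size keeps the sketch short */
    int pasted;      /* nonzero if produced by ## and not yet re-lexed */
};

static int relex_count = 0;   /* how often the slow path was taken */

/* Toy re-lexer: classify a spelling by its first characters. */
static enum pp_kind classify(const char *text) {
    unsigned char c0 = (unsigned char)text[0];
    unsigned char c1 = (c0 != '\0') ? (unsigned char)text[1] : 0;

    if (isalpha(c0) || c0 == '_') return PP_IDENT;
    if (isdigit(c0) || (c0 == '.' && isdigit(c1))) return PP_NUMBER;
    return PP_PUNCT;
}

/* Concatenate two spellings as ## would and mark the result for re-lexing.
   The kind copied from the left operand is only provisional. */
static struct pp_token paste(const struct pp_token *a,
                             const struct pp_token *b) {
    struct pp_token r;

    snprintf(r.text, sizeof r.text, "%s%s", a->text, b->text);
    r.kind = a->kind;
    r.pasted = 1;
    return r;
}

/* Hand a token to the parser: reuse the preprocessor's kind directly,
   and re-lex only when the token came from a paste. */
static enum pp_kind to_parser_kind(struct pp_token *t) {
    if (t->pasted) {
        t->kind = classify(t->text);
        t->pasted = 0;
        relex_count++;
    }
    return t->kind;
}
```

Pasting "." and "5" shows why the slow path is needed: the left operand is
punctuation, but the result ".5" is a number.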

