Re: Modularize compiler construction?

"cr88192" <>
Mon, 25 Jan 2010 07:43:18 -0700

          From comp.compilers


From: "cr88192" <>
Newsgroups: comp.compilers
Date: Mon, 25 Jan 2010 07:43:18 -0700
References: 10-01-080 10-01-082
Keywords: design
Posted-Date: 28 Jan 2010 01:17:47 EST

"Kaz Kylheku" <> wrote in message
> On 2010-01-23, Peng Yu <> wrote:
>> It seems that the current compiler construction tools (at least in
>> bison and flex) are still very primitive. Let's take the following
>> example to explain what I mean.
> bison and flex are not tools for /complete/ compiler construction.


Actually, personally I have not used either of them. Granted, not
having a particularly solid grasp on the subtleties of BNF could be
part of it.

BNF, sort of like traditional math (and, I also suspect, SSA) represents a
"mode of thinking" that I personally don't seem particularly skilled with
(the great oddity is that programming is not particularly difficult, but I
am apparently mathematically unskilled, and find that many "formalisms" seem
to defy understanding...).

well, that, and I have thus far had fairly good luck with hand-written
recursive descent, and so there has not been much reason to do otherwise.

>> easily composing different modules. It seems that there is a great
>> semantic gap between what bison & flex offer and what compiler design
>> need.
> The gap between what you think a tool should do and what it does is
> not a ``semantic gap''.


> Bison does not provide the semantics of translation, only a way to
> build a parser, which is far, far from a complete translation scheme.
> It can be argued that a parser-generation tool /should/ only do that
> one job. A more complete compiler construction suite would still have
> a parser generator tool inside it which does only parser generation.

however, given how much emphasis so many people seem to put on parser
generators, it is not so hard to see how some people could be misled into
thinking that the parser IS the compiler.

sadly, it does not take one long to find, if implementing a compiler for a
non-trivial language (such as the main C-family languages), that the parser
is not really the big source of complexity (though, at least with C and C++,
it can be a big source of slowness, which is not quite the same issue...).

> The GNU project does have a much more complete compiler construction
> suite: it's called the GNU Compiler Collection (perhaps you've heard
> of it). In this suite you can write a new language as a front end
> module, which re-uses the posterior modules.

yes, however GCC does have a few downsides:
like many GNU projects, it is often very difficult to get it to build on
Windows, and so help you if you want to build it with MSVC...;
the code is large, and the code is nasty;
for the most part, it would not seem to be particularly JIT-ready (IOW,
where the whole compilation process takes place in a single process, and
hence the code is written to play friendly with other things going on, ...).

however, what it does, it typically does well (such as supporting a number
of input languages and a much larger number of output targets, ...).

and, architecturally, I actually somewhat prefer GCC over LLVM, since GCC
tends to be a lot more modular, whereas AFAICT modularity has not generally
been high on LLVM's priority list.

IMO, the existence of ASM, object, and binary formats (COFF, PE/COFF, ELF,
...) is not something to be ignored.

similarly, I wouldn't want the parser mixed up in the upper-end compiler
logic (C is already bad enough here, since one needs to keep track of things
like typedef, ..., for the parser to work, but even this does not prevent
creating a proper AST).

similarly, the upper-end compiler can process the AST and produce the output
IL, but as I see it, should remain isolated from other steps (such as the
lower-end), and the lower-end should remain relatively independent of
matters of assembler or link-time specifics.

sadly, there are limits:
some AST-level optimizations would require knowledge of the target
architecture to apply (such as folding 'sizeof' into a constant), and as
well, a lot of code is written in ways which tend to assume knowing the
particular target arch in advance, ...;
to produce proper ASM for an ELF target, one needs to know they are
targeting ELF;

however, if handled well, these cases can be accommodated without breaking
the separation too much.

some more "advanced" optimizations can also be performed with the existence
of side-band metadata (where in my case, the metadata DB is its own module,
itself not directly tied to a particular compiler pass). admittedly, the use
of a metadata DB is in some ways a "cheap trick".
