Re: Two questions about compiler design

Chris F Clark <cfc@shell01.TheWorld.com>
27 Feb 2004 22:14:10 -0500

From comp.compilers

Related articles
[4 earlier articles]
Re: Two questions about compiler design isaac@latveria.castledoom.org (Isaac) (2004-02-08)
Re: Two questions about compiler design peteg@cse.unsw.EDU.AU (Peter Gammie) (2004-02-12)
Re: Two questions about compiler design j.troeger@qut.edu.au (Jens Troeger) (2004-02-12)
Re: Two questions about compiler design samiam@moorecad.com (Scott Moore) (2004-02-12)
Re: Two questions about compiler design nmm1@cus.cam.ac.uk (2004-02-13)
RE: Two questions about compiler design tom@kednos.com (Tom Linden) (2004-02-26)
*Re: Two questions about compiler design cfc@shell01.TheWorld.com (Chris F Clark)* (2004-02-27)**
RE: Two questions about compiler design tom@kednos.com (Tom Linden) (2004-03-02)
Re: Two questions about compiler design cfc@shell01.TheWorld.com (Chris F Clark) (2004-03-06)

| List of all articles for this month |

From:	Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups:	comp.compilers
Date:	27 Feb 2004 22:14:10 -0500
Organization:	The World Public Access UNIX, Brookline, MA
References:	04-02-120 04-02-138
Keywords:	design
Posted-Date:	27 Feb 2004 22:14:10 EST

Tom Linden wrote:
> Between the two came the n-tuple design that Freiburghouse developed
> for PL/I which was widely used for a number of languages by Digital,
> Wang, Prime, DG, Honeywell, CDC, Stratus and others I can't recall.

Ah, fond memories rekindled. Freiburghouse's IL was fairly close to a
useful UNCOL. At Prime, we had frontends for it for PL/I (at least 3
dialects), Fortran, Cobol, Pascal, Basic, Modula-2, and RPG that were
shipped to customers and supported. In house, Kevin Cummings wrote an
Algol-60 frontend as a fun project. I'm pretty sure a C frontend was
written also, but was not the compiler that got shipped to customers.
The best part was that most of the backend and scaffolding for all
those projects was common.

At Prime we even built our own improved global optimizer for the IL.
We started to build a third version of the global optimizer based on
Fred Chow's Stanford thesis, but that fell prey to the second system
syndrome and died a death due to management not wanting to fund the
project unless it had no risks and the project engineers adding things
to reduce the risk, but never willing to say that there were none. (I
left when the project plan was over 100 pages with no end in sight.)

In addition to the hardware vendors Tom listed above, I know two
compiler houses made a good living off the technology, TSI and LPI.
Later in my career I did a stint with LPI.

Having mentioned Fred Chow, it is worth tying this back to P-code.
His thesis used a variant of P-code with register information that he
called U-code. There were several different U-code frontends built
also. I recall C and Fortran at DEC. I was there when they were
retiring the U-code backend, replacing it with the GEM backend.

Having experienced both, I think the U-code IL was better for many
compiler purposes, but not as good an UNCOL. The global optimizer
technology associated with U-code was certainly better, both simpler
to maintain and more sophisticated. In contrast, I think the
Freiburghouse code generator technology was better, especially from an
easy of maintenance standpoint. Part of this was due to the fact the
the Freiburghouse IL was not as close to the machine and let more
frontend semantics peer through. For example, when looking at a
"reference" to a variable in the Freiburghouse IL, one had to know
which frontend produced the reference for certain aspects of its
semantics--that made it a better UNCOL, becase the frontend didn't
have to bend so much to match some other languages memory model.

The distance from the machine helped at code generation time, because
each IL operator stood for something more or less complete and one
could then factor the cases as code generation time. There was a
simple but useful "language" used by the code generators to implement
those semantics. That made all the difference. It made the code
generator into something one could read easily and understand the code
sequences coming out.

In contrast, with the U-code IL, semantics were composed from more
primitive operations and code generation worked by matching
patterns--think BURG. While one can express all the same decisions
that way and perhaps more, it is much less clear to a code generator
writer how a small change will affect the code generated. Of course,
exposing all of those details in the IL was part of what made the
U-code optimizer good. The optimizer could easily rewrite unncessary
operations out of the program, because the operations were all exposed
in the IL. In the Freiburghouse IL, many of those things were more
implicit and thus inaccessible to the optimizer.

Perhaps the most striking thing about that difference is how it was
actually localized to a small part of the different IL's. If one
looked at the opcode listings for both, they would be mostly
identical. The key difference being in the memory access opcodes,
Freiburghouse's IL had only a generic "reference" operation, where
U-code has explicit load and store operations that used explicit
arithmetic to calculate the exact memory location. However, that
semantic difference ripples through the IL and completely changes
everything. One could easily implement a Freiburghouse on nearly any
architecture, memory-to-memory, a few registers, many registers, stack
based, byte addressible, word addressible--the IL was architecure
neutral. In constrast, U-code was optimized toward many register
machines with a load-store architecture (with a preference toward byte
addressible machines). If your machine doesn't look like that, U-code
isn't quite as useful.

Of course, in the current time, we seem to have a paucity of different
architectures, but that won't last forever, I hope. And, that brings
up the final and perhaps most important point.

Freiburghouse designed his IL in a time when many architectures were
around and none were prevelant. Making his IL into an UNCOL,
particularly for different underlying machines was important. For him
investing a few more hours into developing a code generator for a new
machine/language combination actually meant money in his pocket, so he
wanted that task simple enough that he could effectively do it. The
ability to optimize the code on those machines was relevant but the
opportunities were more limited.

The U-code architecture reflects the great shift to more of an
atchitectural monoculture. Different ports were not as different (at
least in terms of basic machine semantics) and getting a more relevant
was achieving optimized results on the machines which were becoming
dominant, machines whose architecture is designed for C-like
languages, byte addressible uniform memory access and a reasonable
sized register file.

Well, I've rambled on enough on this topic. I hope something in here
was of interest....

Hope this helps,
-Chris

*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Two questions about compiler design

Chris F Clark <cfc@shell01.TheWorld.com>27 Feb 2004 22:14:10 -0500

Chris F Clark <cfc@shell01.TheWorld.com>
27 Feb 2004 22:14:10 -0500