High level compiling for low level machines

"Ciaran O'Donnell" <ciaran@inf.enst.fr>
Thu, 3 Nov 1994 16:20:41 GMT

          From comp.compilers

Related articles
High level compiling for low level machines ciaran@inf.enst.fr (Ciaran O'Donnell) (1994-11-03)
High level compiling for low level machines ciaran@inf.enst.fr (Ciaran O'Donnell) (1994-11-16)
| List of all articles for this month |

Newsgroups: comp.compilers
From: "Ciaran O'Donnell" <ciaran@inf.enst.fr>
Keywords: report, available
Organization: Compilers Central
Date: Thu, 3 Nov 1994 16:20:41 GMT

High Level Compiling for Low level Machines
(Ph.D thesis, Ecole National Superieure de Telecommunications, France)
204 pages


Compilers for RISC architectures can contain many phases and their
implementation can require
hundreds of thousands of lines of source code. In the first part of the
thesis, we present a new way of organizing a RISC compiler, centred
around a very high level representation of the program. First, a Control
Flow Graph (CFG) is built by eliminating language artifacts
such as side effects in expressions. Next instructions are selected for
expressions in the CFG. At the same time,
redundant expressions and assignments are eliminated and the information
necessary for optimization (the Control Dependence Graph and SSA form) is
derived.
Our representation contains all the information
necessary for global optimization,
register allocation, and instruction scheduling
The high level character of the source program remains intact.
We have written a C compiler to generate the intermediate representation
in only 13.000 lines of code.


The second part of the thesis proposes extensions and optimizations
using the intermediate form. Recently, an extension of the lambda-calculus
called continuation passing style (CPS) has become very popular. The
concept of continuations can be split into "closures" which memorize
local variables, and "argument binding". We show that argument binding
in the CPS is the same thing as the pseudo assignments SSA form uses.
Procedure calls can be represented very nicely by adding instructions
for closure management to the representation.
A procedure call convention is also described which generates better code.


We show how profile information from execution can be used to guide
optimization, extending on results by Ellis and Sarkar. We show how to
organize the order of basic blocks in memory to minimize the number of
taken branches, extending on a result by Boesch and Gimpel. We discuss
techniques for measuring the run time behaviour of a program.
The optimal ordering of basic blocks in the presence of a direct mapped
cache is characterized in terms of two graph clustering problems.


We describe formal models of the program's control flow and data flow
behaviour in terms of the intermediate representation. The information
content (entropy) of one model uses can be used to measure randomness and
regularity in the branch behaviour of a program.


---------------------------------------------------------------------


The table of contents and the introduction are available electronically,
by E-mail on request. Only paper copies of the thesis will be available.
Those interested should contact the author as soon as possible.


E-mail: ciaran@inf.enst.fr


letter: Ciaran O'Donnell
Departement Informatique
Ecole National Superieure de Telecommunications
46, rue Barrault
75634 Paris Cedex 13
FRANCE
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.