Optimizing questions

Taylor Simpson <lts@cs.rice.edu>
27 Aug 1996 23:53:04 -0400

          From comp.compilers


Newsgroups: comp.compilers
Organization: Compilers Central
References: 96-08-078
Keywords: optimize

gclind01@starbase.spd.louisville.edu (George C. Lindauer) writes:
|> Another thing is, one day I coded a bunch of routines to make dags out of the
|> icode for basic blocks. Then I was aghast at the code it took, and I was
|> aghast at the run-time memory requirements. The only nice thing I could
|> find about it was once the dag is constructed you don't have to go
|> node-searching. Unfortunately to construct the dag you DO have to go node
|> searching. I eventually gave up on dags and settled on a linear search
|> through the basic block to detect common subexpressions. Less code, less
|> memory, and hopefully only a little more time. Does anyone have a feel for
|> the relative time for a linear search for reuse versus a search through the
|> DAG during dag creation?

I assume that ``linear search'' refers to value numbering.
The main difference between building DAGs and performing value numbering is that
DAG construction uses a separate data structure (the DAGs) and then regenerates
the intermediate code from the DAGs. Value numbering operates directly on the
intermediate code - overwriting it as it goes along. The difference in memory
requirements will be the size of the DAG and the memory required to regenerate
the code (unless you can reuse the memory used by the original version of the
block). In both algorithms, you need an efficient hashing function to look up
expressions. Given the same basic block, the contents of the hash table should
be the same for each algorithm.
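To make the hash-table lookup concrete, here is a minimal sketch of local value numbering over three-address tuples. The `(dest, op, src1, src2)` encoding and all helper names are my own illustration, not code from either post:

```python
# A minimal sketch of local value numbering for one basic block.
# Instructions are tuples (dest, op, src1, src2); src2 may be None.

def value_number(block):
    vn = {}       # operand -> value number
    table = {}    # (op, vn of src1, vn of src2) -> value number (the hash table)
    names = {}    # value number -> set of names currently holding that value
    counter = [0]

    def number_of(x):
        # Constants and first uses of a name get fresh value numbers.
        if x not in vn:
            vn[x] = counter[0]
            counter[0] += 1
        return vn[x]

    out = []
    for dest, op, a, b in block:
        key = (op, number_of(a), number_of(b))
        v = table.get(key)
        # dest is being (re)defined: it no longer holds its old value.
        if dest in vn:
            names.get(vn[dest], set()).discard(dest)
        if v is not None and names.get(v):
            # Redundant computation: copy from a name that still holds it.
            src = min(names[v])
            out.append((dest, 'copy', src, None))
        else:
            # Not seen before, or seen but no name still holds the value,
            # so the expression must be (re)computed.
            out.append((dest, op, a, b))
            if v is None:
                v = counter[0]
                counter[0] += 1
                table[key] = v
        vn[dest] = v
        names.setdefault(v, set()).add(dest)
    return out
```

The `names` map (a set of names per value number) is what keeps the table honest when variables are redefined, which is exactly the issue discussed below.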

There are differences in the code resulting from each algorithm, because the
DAG is only partially ordered while the code in the basic block is totally
ordered. The ordering comes into play when variables are
redefined (see below).

|> Also does anyone know if I am introducing
|> potential problems with bad code by going at it this way? I'm keeping the
|> DAG version in case I ever need to change the implementation.

The main problem arises when variables are redefined. Assume we are value
numbering the following:
            A = X + Y
            B = X + Y
            A = ...
        (1) C = X + Y
            B = ...
            C = ...
        (2) D = X + Y

At statement (1), the expression X + Y is found in the hash table, but is
available in B and not in A. We can handle this by keeping a list of names
for each expression and carefully keeping the list up to date. At statement
(2), the situation is worse; X + Y is in the hash table, but it is not available
in any name, since every variable that held it has since been redefined.

If we are building DAGs, we can always access the node for X + Y. The trick
comes when we are ready to regenerate the code. We may be able to generate the
instructions in an order such that X + Y is available when it is assigned to D.
If not, we can save the value into a temporary name and copy that value into D.
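Concretely, for the block above, the regenerated code can compute X + Y once into a compiler temporary so that its value survives the redefinitions of A, B, and C. The tuple encoding, the `ld` placeholder standing in for the elided "..." right-hand sides, and the tiny interpreter below are my own illustration, used only to check that the rewritten block computes the same final values:

```python
def run(code, env):
    # Tiny interpreter for three-address tuples, to sanity-check the rewrite.
    env = dict(env)
    for dest, op, *args in code:
        if op == '+':
            env[dest] = env[args[0]] + env[args[1]]
        elif op in ('copy', 'ld'):  # 'ld' stands in for the elided "..." assignments
            env[dest] = env[args[0]]
    return env

original = [
    ('A', '+', 'X', 'Y'),
    ('B', '+', 'X', 'Y'),
    ('A', 'ld', 'p1'),     # A = ...
    ('C', '+', 'X', 'Y'),  # statement (1)
    ('B', 'ld', 'p2'),     # B = ...
    ('C', 'ld', 'p3'),     # C = ...
    ('D', '+', 'X', 'Y'),  # statement (2)
]

# Regenerated from the DAG: X + Y is computed once into the temporary t0,
# so its value is still available at statements (1) and (2).
regenerated = [
    ('t0', '+', 'X', 'Y'),
    ('A', 'copy', 't0'),
    ('B', 'copy', 't0'),
    ('A', 'ld', 'p1'),
    ('C', 'copy', 't0'),   # statement (1)
    ('B', 'ld', 'p2'),
    ('C', 'ld', 'p3'),
    ('D', 'copy', 't0'),   # statement (2)
]
```

Running both versions from the same initial environment leaves A, B, C, and D with identical final values, while the rewritten block evaluates X + Y only once.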

Taylor Simpson
Massively Scalar Compiler Project Rice University
lts@cs.rice.edu http://www.cs.rice.edu/~lts
