15 Dec 1997 21:59:01 -0500

Related articles |
---|

Common subexpressions optimizations lmk@lagl.sorosis.ro (LMK shell) (1997-12-10) |

Re: Common subexpressions optimizations clark@quarry.zk3.dec.com (Chris Clark USG) (1997-12-15) |

Re: Common subexpressions optimizations cliff.click@Eng.Sun.COM (cliffc) (1997-12-23) |

From: | Chris Clark USG <clark@quarry.zk3.dec.com> |

Newsgroups: | comp.compilers |

Date: | 15 Dec 1997 21:59:01 -0500 |

Organization: | Digital Equipment Corporation - Marlboro, MA |

References: | 97-12-076 |

Keywords: | optimize, analysis |

LMK shell <lmk@lagl.sorosis.ro> wrote:

*> I'm trying to detect common subexpressions and put them in an unused*

*> register or on the stack. I'm looking for some efficient algorithm to*

*> detect them -- pattern matching doesn't seem too fast.*

I presume you are trying to detect "local" common sub-expressions,

those within a basic block (a linear region of code), and you are

trying to match expression trees from the top down using pattern

matching and this is what you are finding slow.

If so, one solution which I have seen widely used is "value

numbering", or at least that's the name I know it by. In value

numbering, you give each distinct value (which we'll define in a

moment) a unique number and if two expressions have the same value

number, they are the same expression.

Each distinct constant is given a distinct value number. Each

distinct variable is given a distinct value number. Morever, if the

you are compiling an imperative language where variables can change

value, each time a variable is referenced such that it might have

changed from the last reference, it is given a unique value number.

For example, if there is an assignment statement (or procedure call)

which can effect the variables value, the variable, the variable is

assumed to have been changed by that assignment. Anyway, at this

point, you should have value numbers for all of the leaves of your

expression trees--in other words if you have something other than

variables and constants (function results perhaps?) that are leaves of

an expression you given them also appropriate value numbers.

Next, you build up value numbers for the rest of your expression trees

working from the leaves to the roots. Consider one such expression,

say an add of two things. You first check to see if you have already

seen an expression which adds the same two value numbers, and if so

you give this expression the same value number as the previous

occurance. Otherwise, you give this expression a new unique value

number. One way to do the look up is to keep a hash table of the

expressions you have already assigned value numbers to, for example by

adding the value numbers of the operands and a numeric value to

represent the operator, and using that to start your search.

If you find a compiler book which discusses optimization, you will

probably find a more thorough explanation of the process, but

hopefully that will give you a start. In addition, if you are serious

about value numbering, you will want to worry about "canonicalizing"

your expressions, so that a+b gets the same value number as b+a,

provided they mean the same thing in your language. (I'm pretty

certain that the problem of matching a+(b+c) with (a+b) versus (b+c)

is hard (NP-complete?) in the general case, although there are

hueristics to use.)

If you want "global" common subexpressions (across basic blocks), you

will need data-flow analysis. However, most global methods assume

that local common sub-expressions have already been resolved, so

you'll need a method like this anyway.

Hope this helps,

-Chris Clark

************************************************************************

Compiler Resources, Inc. email: compres@world.std.com

3 Proctor St. http://world.std.com/~compres

Hopkinton, MA 01748 phone: (508) 435-5016

USA 24hr fax: (508) 435-4847

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.