Re: WANTED: Rating of common optimization routines.

cliffc@ami.sps.mot.com (Cliff Click)
15 Jan 1996 21:00:25 -0500

          From comp.compilers

Related articles
WANTED: Rating of common optimization routines. napi@ms.mimos.my (1996-01-12)
Re: WANTED: Rating of common optimization routines. dlmoore@ix.netcom.com (1996-01-13)
Re: WANTED: Rating of common optimization routines. cliffc@ami.sps.mot.com (1996-01-15)
Re: WANTED: Rating of common optimization routines. bill@amber.ssd.hcsc.com (1996-01-15)
Re: WANTED: Rating of common optimization routines. jgj@ssd.hcsc.com (1996-01-16)
Re: WANTED: Rating of common optimization routines. rubin@scrugs.amt.tay1.dec.com (1996-01-17)

From: cliffc@ami.sps.mot.com (Cliff Click)
Newsgroups: comp.compilers
Date: 15 Jan 1996 21:00:25 -0500
Organization: none
References: 96-01-010
Keywords: optimize

napi@ms.mimos.my (Mohd Hanafiah Abdullah) writes:


> I wonder if there's any published report on the commonly employed
> optimization routines/procedures targeted for scalar-type RISC based
> applications such as:
>
> o register allocation
> o copy propagation
> o common subexpression optimization
> o peephole optimization
> o strength reduction
> o code motion of loop invariants
> o ...
>
> that is, regarding each one's contribution to the percentage of
> improvement on the target execution speed when turned on. It would be
> interesting to know which ones provide the highest improvement so
> that emphasis can be put on them. I tend to think in general that
> register allocation rates the highest, followed by peephole
> optimization and strength reduction.


Wow! If only things were that simple!
*EVERYTHING* interacts with everything else.
That is, the CPU/memory-subsystem/I-O system/OS/application ALL
interact with the above optimizations, which (of course) all
interact with each other. These interactions make compilers fun...


For instance: you are optimizing the inner kernel of a matrix multiply
on a modern RISC. Your compiler (at first) does NO optimizations,
just lives with the naive code produced by the parser. You decide to do
strength reduction - some addressing multiplies become adds or shifts.
Did you dead-code-eliminate the adds? Your loop speeds up some, but
your loop is still full of loop-invariant and dead code. Suppose you
see a 2% gain.


Instead you try hoisting loop invariants (no DCE yet). Maybe you get
10%. A lot of multiplies get hoisted. Now you strength reduce again,
but the multiplies were already hoisted - little gain from SR.


Ok, start with CSE. Lots of common addressing code, so some 10% gain.
Combine it with invariant hoisting - what happens? A bunch of CSEs
get hoisted, so the total gain isn't 20% - except that the CSEs that
stay in the inner loop now represent a larger fraction of dynamically
executed instructions. Whammo - another big gain.


You look at the inner loop, and see a darned integer multiply - after
hoisting and CSE the inner loop is getting kinda sparse. NOW that
multiply represents a large fraction of all executed instructions.
You try SR again and *bing* suddenly get another 20%.


In short, optimizations interact. Applied standalone you'll get one
number (perhaps a disappointing one). Applied in combinations, the
same optimization can be crucial in getting a 2x speedup.


Cliff
--
Cliff Click Compiler Researcher & Designer
RISC Software, Motorola PowerPC Compilers
cliffc@risc.sps.mot.com (512) 891-7240
http://members.aol.com/mjclick1
--

