|Register Allocation and Instruction Scheduling firstname.lastname@example.org (1994-03-23)|
|Register allocation and instruction scheduling email@example.com (1999-01-17)|
|Re: Register allocation and instruction scheduling firstname.lastname@example.org (David A. Greene) (1999-01-19)|
|Re: Register allocation and instruction scheduling email@example.com (1999-01-20)|
|Re: Register allocation and instruction scheduling firstname.lastname@example.org (David A. Greene) (1999-01-22)|
|Re: Register allocation and instruction scheduling email@example.com (1999-01-25)|
|Re: Register allocation and instruction scheduling firstname.lastname@example.org (1999-01-27)|
|Re: Register allocation and instruction scheduling email@example.com (1999-01-27)|
|Re: Register allocation and instruction scheduling firstname.lastname@example.org (1999-01-31)|
|Re: Register allocation and instruction scheduling email@example.com (1999-02-01)|
|From:||firstname.lastname@example.org (Bob Morgan)|
|Date:||20 Jan 1999 12:47:50 -0500|
|Organization:||Digital Equipment Corporation|
On 17 Jan 1999 20:46:33 -0500, email@example.com (Chris) wrote:
>1) How to take into account register renaming by the hardware by the
>compiler: especialy when using graph coloring techniques?
>2) How to take into account hardware instruction scheduling/reordering
>when trying to do different types of compiler instruction optimzation.
>Do I have to take into account register renaming and instruction
>reordering when trying to design an optimized compiler?
>Doesn't the hardware essentially undo all of my hard work in trying to
>write an optimized compiler? The book doesn'talk about these topics.
Hardware Register Renaming solves one problem and introduces another.
It solves the problem of instruction overlapping when there is a reuse
Load X into R1
Store R1 into Y
Load U into R1
Store R1 into Z
Register renaming allows the computer to start the execution of the
Load of U before either the Load or Store of X into Y is completed.
The problem is: there are still a finite number of physical registers.
When all of those registers are in use through register renaming then
the processor must stall until some registers become available.
If you are only mildly concerned with optimization you can ignore this
problem. If you are trying to use the processor as efficiently as
possible then the compiler must try to keep an approximation of how
many of these physical registers are available (precision is probably
impossible). Then the compiler should schedule to avoid complete use
of the physical register set (among other things).
A good way to view the back end of a compiler is that it symbolic
simulates the processor, emitting instructions as it goes to insure
the same effect (in reality - not physically) when the program
executes. This includes simulating register use (register allocation),
order of execution (instruction scheduling) and the use of the
physical registers. The simulation should reorder instructions to
insure that all computations can be done on the target machine (i.e.
you don't use too many registers or whatever other resource is
available on the machine) and to minimize the overuse of
resources(which will cause a processor stall).
To your general question: What optimizations should be done on an
out-of-order processor. First there has to be some order (randomness
is not effective). Many of the out-of-order processors have in-order
retirement of instructions. This means that there is some buffer in
the processor keeping all of the instructions which have been issued
but not retired. Instructions are removed from the buffer after they
have been completed and all instructions issued before them have
retired. The idea of retirement is needed if a portion of the
execution must be restarted (replays).
Given that this buffer exists I guess that one wants to schedule
instructions to encourage early retirement of all instructions. This
will keep the buffer from filling up and give the best performance.
One part of doing this is scheduling instructions which usually have a
fixed length of execution in the same way that RISC processors
schedule them. The out of order aspects of the processor are used to
handle variable length instructions such as loads.
Return to the
Search the comp.compilers archives again.