Re: Nitty-gritty aspects of register allocation

anton@mips.complang.tuwien.ac.at (Anton Ertl)
Fri, 11 Sep 2020 10:35:51 GMT

          From comp.compilers

Related articles
Nitty-gritty aspects of register allocation elronnd@elronnd.net (Elijah Stone) (2020-09-10)
Re: Nitty-gritty aspects of register allocation alexfrunews@gmail.com (Alexei A. Frounze) (2020-09-10)
Re: Nitty-gritty aspects of register allocation bo@bo-persson.se (Bo Persson) (2020-09-11)
Re: Nitty-gritty aspects of register allocation anton@mips.complang.tuwien.ac.at (2020-09-11)
| List of all articles for this month |

From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Date: Fri, 11 Sep 2020 10:35:51 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
References: 20-09-028
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="86722"; mail-complaints-to="abuse@iecc.com"
Keywords: code, optimize
Posted-Date: 11 Sep 2020 22:50:07 EDT

Elijah Stone <elronnd@elronnd.net> writes:
>There is a lot of research and a lot of resources on the high-level
>aspects of register allocation--colouring interference graphs, lifetimes,
>CFGs, etc.
>
>But there are a lot of low-level architecture-specific things that these
>models don't account for on their own. Most notably, not all registers
>can be used for the same things. Just on amd64:
>
>- Multiplication and division use the rdx:rax register pair.
>
>- Bit shifts always use cl.
>
>- When calling functions, their arguments have to go in specific registers.
>
>In all of these cases, you can spill whatever happens to already be in
>those registers, but it would be nicer if you could arrange for the
>appropriate values to already be in the right places. But how?


A classical approach (at least for graph colouring) is to have a
register coalescing phase before the actual register allocation.
Register coalescing tries to eliminate move instructions by combining
the two live ranges connected by the move into one live range (i.e.,
the same register is used for both). If you have that, you can insert
move instructions before doing the coalescing and register allocation,
and coalescing will eliminate them if possible, and will keep them if
not.


So, for the problem with fixed registers you create live ranges for
the fixed registers, and insert a move from another live range A to
the fixed live range X just before the value in A is needed in the
fixed register, and a move from the fixed live range Y to another live
range B just after a value is put into a fixed register (by an
instruction or by the calling convention). Coalescing will eliminate
as many of these moves as it can; if it can eliminate both of the
moves in the examples, A is in the register X and B is in the register
Y, with no moves necessary.


> But that's not a general solution
>(imagine if every register has a few pieces of unique functionality)


The technique above is a workaround for relatively rare cases; it
probably does not work so well for an instruction set like the 8080,
where every register has a special purpose. That's why we have seen
architectures with general-purpose registers ("register machines")
whenever it was expected that most of the code is generated by
compilers rather than assembly programmers. That includes IA-32 and
AMD64, where the instructions with implicit registers (like XLAT or
LODS) were superseded by faster implementations of general
instructions (e.g., on the 486 LODS takes 5 cycles, while the general
replacement takes 2 cycles and has no register requirements).


>- Some registers need a special prefix to be used (REX prefix). These
> registers are generally different from the special-purpose registers
> (for e.g. multiplication). Is it better to put a non-multiplied value
> in a REX-prefixed register, or keep it in an unprefixed register and
> spill it later when you need to multiply?


Coalescing should answer that.


One other aspect of REX is that it is necessary to get 64-bit width
for many instructions. So if you have a 64-bit data type, you can
just as well put it into R8-R15, because you often need a REX anyway.
I would do this with preferencing (i.e., in the register allocation
phase, when you have to choose a register and one in the preferred
class is available, choose it).


>- The second-lowest 8 bits of some registers can be addressed
> separately. When does it make sense to use them?


Probably never. The 16 other 8-bit registers should be enough; and
even if you want to use them, I recommend checking if using AH BH CH
DH is not a performance pitfall.


- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.