The shortest way with programs (was Compile HLL to microcode) (David L Moore)
18 Apr 1996 00:38:01 -0400

          From comp.compilers


From: (David L Moore)
Newsgroups: comp.arch,comp.compilers
Date: 18 Apr 1996 00:38:01 -0400
Organization: Netcom
References: 96-04-059 96-04-083 96-04-094
Keywords: architecture, performance

> And that's the whole deal. With instruction caches, you can make the
> hardwired instruction cycle as fast as microcode. There's no reason
> to have a separate level of instruction interpretation.

Except, of course, that you can't - because the microcode was only
instantiated once but equivalent sequences of instructions will be
instantiated often, so you need much more of your highest speed

One of the other features of RISC was, of course, that the silicon
freed up by simplifying the instruction set could be used for more
registers. Modern CISC chips use REGISTER RENAMING to achieve the same
thing. That is, even though you think you only have 6 useful
registers, you really have more because when you assign to a register,
it is given a new name in the register file.
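
As a rough illustration of the renaming idea, here is a toy model in Python. Everything in it (the Renamer class, the register names r0 and r1, the three-operand instruction tuples) is made up for the sketch, and it never recycles physical registers, which real hardware must do.

```python
class Renamer:
    """Toy register renamer: architectural names map to physical registers."""

    def __init__(self, arch_regs, num_physical):
        # Assumption: physical registers are integers handed out in order,
        # and they are never returned to the free list in this sketch.
        self.free = list(range(num_physical))
        self.table = {r: self.free.pop(0) for r in arch_regs}

    def read(self, arch_reg):
        # A source operand uses whatever physical register currently
        # holds the latest value of the architectural register.
        return self.table[arch_reg]

    def write(self, arch_reg):
        # Every assignment gets a fresh physical register ("a new name").
        phys = self.free.pop(0)
        self.table[arch_reg] = phys
        return phys


def rename(program, renamer):
    """Rewrite (dest, src1, src2) tuples in terms of physical registers."""
    renamed = []
    for dest, src1, src2 in program:
        p1, p2 = renamer.read(src1), renamer.read(src2)
        pd = renamer.write(dest)  # sources are looked up before dest is renamed
        renamed.append((pd, p1, p2))
    return renamed
```

With two architectural names and six physical registers, two successive assignments to r0 land in two different physical registers, so the program sees two names while the hardware uses more.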

Warning - possible hare-brained idea ahead.

Now if you took this renaming hardware and applied it to a RISC chip,
you could perform outlining (the opposite optimisation to inlining).

Something like outlining has been used for years in EXTRACODE routines
- when you execute an instruction that is not implemented on your
computer because your computer centre did not have the money to buy
that instruction, it executes a subroutine.

One of the problems with such routines is that the first thing you have
to do is work out where the operands are, move them somewhere
standard, do the work, and then move the result where it needs to
go. For short pieces of code, this takes too long.
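
To put rough numbers on that overhead, here is a back-of-envelope cost model in Python. The function name, the one-instruction-per-move cost, and the decode_cost parameter are all assumptions made for the sketch, not measurements of any real machine.

```python
def extracode_cost(operand_locs, result_loc, std_operands, std_result,
                   work, decode_cost=1):
    """Instruction count for one extracode call under a fixed convention.

    Assumptions: each register move costs one instruction, and working out
    where the operands are costs decode_cost instructions.
    """
    # Operands that are not already in their standard place must be moved.
    moves_in = sum(1 for actual, std in zip(operand_locs, std_operands)
                   if actual != std)
    # The result must be moved unless it is wanted where the routine leaves it.
    move_out = 0 if result_loc == std_result else 1
    return decode_cost + moves_in + work + move_out
```

For a routine doing two instructions of real work, with both operands and the result in the wrong place, the model charges six instructions in all, four of which are pure overhead.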

Trying to ensure that the same registers are always used so you can
outline a given common code sequence is likely to cause too many
performance hits, especially when you try to use many such sequences.

A register renamer could allow you to outline aggressively and so
improve the I-cache performance of code by allowing the operands to be
anywhere without you having to do any extra work in the outlined
code. You would do all operations in the extra-code using the names
given to the operands by the renamer - that is, the renamer binds
formal registers to actual registers. You would want out-of-order
instruction execution too, I should think.
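
A minimal sketch of that binding, again in Python with invented names: the outlined body is written once in terms of formal registers f0..f2, and each call site supplies a binding from formals to whichever actual registers happen to hold its operands. The binding dictionary stands in for the renamer; no real hardware interface is implied.

```python
# One shared copy of the outlined sequence, written with formal registers.
OUTLINED_BODY = [
    ("add", "f2", "f0", "f1"),  # f2 = f0 + f1
    ("mul", "f2", "f2", "f2"),  # f2 = f2 * f2
]

OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}


def call_outlined(body, binding, regs):
    """Run the shared body with formals bound to the caller's registers.

    binding maps formal names to actual register names (hypothetical
    stand-in for the renamer). No operand is copied to a standard place.
    """
    for op, dest, src1, src2 in body:
        a = regs[binding[src1]]
        b = regs[binding[src2]]
        regs[binding[dest]] = OPS[op](a, b)
    return regs
```

Two call sites with operands in quite different registers can share the same body, for example one binding f0, f1, f2 to r5, r7, r9 and another binding them to r1, r2, r3, and neither executes a single move.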

In the first instance, one could try such a scheme on procedure calls
in a software emulator. Instead of putting the parameters in fixed
registers and getting the result back in another fixed register you
allow the hardware register renamer to bind the parameters for
routines. With modern styles of programming, this is similar to
outlining - except you would simply avoid inlining all those short
method routines that normally get inlined to save procedure call
overhead.
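
The I-cache side of the trade-off can be sketched with simple arithmetic. The model below is an assumption-laden toy: it counts instructions only, charges one instruction per call and per register move, and folds the return into the body length.

```python
def code_size(body_len, call_sites, strategy, moves_per_call=0):
    """Static instruction count for a short routine used at many sites.

    strategy "inline": the body is copied at every call site.
    strategy "call":   one shared body, plus a call (and any register
                       moves the calling convention forces) at each site.
    """
    if strategy == "inline":
        return body_len * call_sites
    if strategy == "call":
        return body_len + call_sites * (1 + moves_per_call)
    raise ValueError("unknown strategy: " + strategy)
```

For a four-instruction method used at ten sites, this gives 40 instructions inlined, 44 when called with three moves per site to satisfy a fixed register convention, and 14 when the parameters are bound with no moves at all.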

Of course, this makes the renamer rather more complicated - perhaps
too complicated. You also want your I-cache to have lots of small
lines rather than fewer longer lines, so you could run into
fanout/gate delay difficulties on your I-cache associater. (I must
admit to having no idea what fan-outs are currently possible.)

David Moore
