From: Bernd Paysan <firstname.lastname@example.org>
Date: 28 Mar 2000 01:04:43 -0500
Organization: Customer of UUNET Deutschland GmbH, Dortmund, Germany
References: 00-03-101 00-03-124 00-03-130
> In particular, FXCH (almost) always pairs with other instructions, and
> allows one to (almost) emulate a register based architecture. In
> particular, A*B+C*D+E*F can be rewritten to pipeline the three
> multiplications so they run in parallel and are joined at the end.
> Not exactly a conventional stack-based architecture...
You get a stack notation when you do a *sequential* tree walk on an
expression. It is not surprising that the resulting code is sequential,
even if the tree (or graph) contained ILP. Since stack notation and
trees are just two representations of the same thing, you certainly can
recover the original tree from the stack notation and reparallelize
it, though that is not exactly the most cost-efficient solution.
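To make the equivalence concrete, here is a minimal Python sketch using the A*B+C*D+E*F example from the quoted text. The names `Op`, `to_stack_code`, and `from_stack_code` are made up for illustration; nothing here is taken from a real compiler. A post-order walk emits the stack code, and replaying that code against a stack rebuilds the same tree.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Op:
    """A binary operator node; leaves are plain variable names (str)."""
    sym: str
    left: "Node"
    right: "Node"


Node = Union[str, Op]


def to_stack_code(node):
    """Sequential post-order walk: operands first, then the operator."""
    if isinstance(node, str):
        return [node]
    return to_stack_code(node.left) + to_stack_code(node.right) + [node.sym]


def from_stack_code(code):
    """Replay the stack code, pushing subtrees instead of values."""
    stack = []
    for tok in code:
        if tok in ("+", "*"):
            right = stack.pop()
            left = stack.pop()
            stack.append(Op(tok, left, right))
        else:
            stack.append(tok)
    if len(stack) != 1:
        raise ValueError("malformed stack code")
    return stack[0]


# (A*B + C*D) + E*F
expr = Op("+", Op("+", Op("*", "A", "B"), Op("*", "C", "D")), Op("*", "E", "F"))
code = to_stack_code(expr)
rebuilt = from_stack_code(code)
print(" ".join(code))
print(rebuilt == expr)
```

The emitted code is strictly sequential even though the three multiplications are independent in the tree; the parallelism only becomes visible again once the tree has been rebuilt.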
Piercarlo Grandi had the idea of using several stacks, which allows more
than one stack operation to be scheduled in parallel. I went further down
this path and came up with my 4stack architecture. The main idea is still
to provide enough execution units for the sweet spot of ILP (assumed to be
4 ALU ops per cycle), and to take advantage of the overall simpler
structure of a stack machine.
It certainly takes equally sophisticated compilers to produce good code
for it. Now that GCC has gained important pieces like SSA graph
generation, I can look again at producing a compiler for it (more than
the prototype, which assumed that insns were already available in SSA
form and then did some rather trivial scheduling, mostly a parallelized
version of the tree walk).
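As a rough illustration of what a "parallelized tree walk" might look like, the following Python sketch does greedy list scheduling over an expression tree. It is not the 4stack prototype or its instruction set: the `schedule` function, the tuple encoding of the tree, and the `t0`, `t1`, ... temporary names are all assumptions made for this example, and assigning values to individual stacks is not modelled at all. The default `width=4` simply mirrors the 4 ALU ops per cycle mentioned above.

```python
def schedule(expr, width=4):
    """Greedy list scheduling of a binary expression tree.

    expr is either a variable name (str) or a tuple (sym, left, right).
    Returns a list of cycles; each cycle is a list of (temp, sym, a, b)
    operations whose operands were all available before that cycle.
    """
    ops = []  # operator nodes in post-order, each naming its result temp

    def walk(node):
        if isinstance(node, str):
            return node
        sym, left, right = node
        a = walk(left)
        b = walk(right)
        name = f"t{len(ops)}"
        ops.append((name, sym, a, b))
        return name

    walk(expr)
    temps = {op[0] for op in ops}  # distinguishes temporaries from inputs
    done = set()                   # temporaries computed in earlier cycles
    pending = list(ops)
    cycles = []
    while pending:
        # An op is ready when every temporary it reads is already computed.
        ready = [op for op in pending
                 if all(x not in temps or x in done for x in op[2:])]
        issue = ready[:width]      # at most `width` ops per cycle
        cycles.append(issue)
        done.update(op[0] for op in issue)
        pending = [op for op in pending if op not in issue]
    return cycles


# (A*B + C*D) + E*F
expr = ("+", ("+", ("*", "A", "B"), ("*", "C", "D")), ("*", "E", "F"))
for n, cycle in enumerate(schedule(expr), start=1):
    print(n, ["%s = %s %s %s" % (t, a, s, b) for t, s, a, b in cycle])
```

With four units the three multiplications issue together in the first cycle and the two additions follow one per cycle; with `width=1` the same walk degenerates into the five-step sequential order.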