Related articles |
---|
[3 earlier articles] |
Re: How many vector registers are useful? jlg@cochiti.lanl.gov (1993-01-26) |
Re: How many vector registers are useful? hyatt@cis.uab.edu (1993-01-27) |
Re: How many vector registers are useful? jrbd@craycos.com (1993-01-27) |
Re: How many vector registers are useful? hrubin@pop.stat.purdue.edu (1993-01-28) |
Re: How many vector registers are useful? sanjay@equalizer.cray.com (1993-01-29) |
Re: How many vector registers are useful? shubu@cs.wisc.edu (1993-01-30) |
Re: How many vector registers are useful? kurz@math.uni-frankfurt.de (1993-02-01) |
Newsgroups: | comp.sys.super,comp.arch,comp.compilers |
From: | kurz@math.uni-frankfurt.de (Volker Kurz) |
Followup-To: | comp.sys.super |
Keywords: | architecture, performance |
Organization: | University of Frankfurt/Main, Dept. of Mathematics |
References: | 93-01-174 |
Date: | Mon, 1 Feb 1993 15:00:03 GMT |
kirchner@uklira.informatik.uni-kl.de (Reinhard Kirchner) writes:
> [is a large vector] register file useful at all ?
Definitely yes.
> A register has an optimizing effect only when the value in it can be used
> several times, at least twice, ...
>
> But how is this on vector machines ? The register creates a speedup only
> when it can hold an entire vector, which can be used again later. This
> requires a register long enough to do so. That means vectors of e.g. a
> length of 5000 can not be held anyway, every machine must load, process,
> and store it in pieces, and only a lot of memory bandwidth helps.
Every vector instruction incurs its own startup period. So if you have to
cut your original vector(s) into pieces that fit into a vector register,
longer registers mean fewer pieces and therefore fewer startups. That is
the advantage of configuring a few very long registers.
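To put rough numbers on the startup argument, here is a minimal sketch of a cost model for processing a long vector in register-sized pieces. The vector length of 5000 is taken from the question above; the register lengths (64 and 1024 elements), the startup cost of 50 cycles, and the one-cycle-per-element rate are made-up illustration values, not figures for any real machine.

```python
def strip_count(vector_length, register_length):
    """Number of register-sized pieces needed to cover the vector."""
    return (vector_length + register_length - 1) // register_length


def total_cycles(vector_length, register_length, startup_cycles,
                 cycles_per_element=1):
    """Each piece pays one startup; every element pays the streaming cost."""
    pieces = strip_count(vector_length, register_length)
    return pieces * startup_cycles + vector_length * cycles_per_element


if __name__ == "__main__":
    n = 5000       # vector length from the question
    startup = 50   # hypothetical startup cost in cycles
    for reg_len in (64, 1024):  # hypothetical register lengths
        print(reg_len, strip_count(n, reg_len),
              total_cycles(n, reg_len, startup))
```

In this model the streaming cost is the same in both configurations; only the number of startups changes with the register length.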
> When configured as a few long vectors the Fujitsu vector registers may
> help, but then comes the second question: Are there any statistics on the
> reusing of vectors? I know about such things for scalar registers, where
> people found that 32 is plenty enough, and only 8 help a lot. But in these
> cases registers are used for loop indexes, addresses etc., which can not
> be compared to the use of vector registers.
>
> So: what can be gained with such a big vector register file ? Or is it
> only of limited help ? Can the register file be traded against bandwidth to
> load and store from memory ?
Yes it can, and this may be the main reason why Fujitsu gave us such a
large register file.
If you configure more but shorter registers, then you have enough space to
keep intermediate results. This may be the most important advantage of a
large register file: avoiding memory traffic altogether.
By keeping intermediate results in vector registers, you increase the
computational intensity, which is defined as
      number of arithmetic operations
    ----------------------------------
    number of (main-)memory references
This has to be seen together with the number of data paths (the maximum
number of memory references per pipe per cycle), which is 3 for a Cray
Y-MP, 2 for a VP1xxx (as you have in Kaiserslautern) and, alas, only 1 for
a VP2xxx. As a rule of thumb, a good estimate for an upper bound of the
speed of an arithmetic operation is
    min{computational intensity * data paths, 1} * peak performance
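The rule of thumb can be written down as a small sketch. The data-path counts are the ones given above; the function names are my own, and the example operation (a scaled vector add y = a*x + y, counted as 2 arithmetic operations against 3 memory references per element) is only an illustrative assumption. Peak performance is normalised to 1, so the result reads as a fraction of peak.

```python
from fractions import Fraction

# Data paths per pipe per cycle, as stated in the text.
DATA_PATHS = {"Cray Y-MP": 3, "VP1xxx": 2, "VP2xxx": 1}


def computational_intensity(arithmetic_ops, memory_refs):
    """Arithmetic operations per main-memory reference."""
    return Fraction(arithmetic_ops, memory_refs)


def speed_bound(intensity, data_paths, peak=1):
    """Rule-of-thumb upper bound: min{intensity * data paths, 1} * peak."""
    return min(intensity * data_paths, 1) * peak


if __name__ == "__main__":
    # Assumed example: y = a*x + y, i.e. load x, load y, store y,
    # with one multiply and one add per element.
    ci = computational_intensity(2, 3)
    for machine, paths in DATA_PATHS.items():
        print(machine, speed_bound(ci, paths))
```

The cap at 1 reflects that extra memory bandwidth cannot push an operation beyond the peak rate of the arithmetic pipes.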
A simple vector add has a computational intensity of 1/3, so it requires 3
data paths for full speed. This is the case on a Y-MP (at least
theoretically; you cannot get the full speed because of memory conflicts
with other processors). On a VP2xxx, however, you get only roughly 1/3 of
peak performance. On the latter machine, increasing computational
intensity has a dramatic impact on the sustained speed. In many cases
(among these is matrix multiplication) you can increase computational
intensity by unrolling outer loops. This is where a large number of
vector registers is very useful.
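As a rough illustration of the unrolling effect, consider the column-oriented inner step of a matrix multiplication, y = y + a1*x1 + ... + au*xu, where u is the unrolling depth of the outer loop. The counting model below is my own simplification: per vector element it assumes u loads for the columns x1..xu, one load and one store for y, and 2u arithmetic operations, with the scalars a1..au held in scalar registers and the running sum kept in a vector register between the u updates. The function names are hypothetical.

```python
from fractions import Fraction


def unrolled_intensity(unroll):
    """Computational intensity of y += a1*x1 + ... + au*xu per element.

    Assumed counts: 2*u arithmetic operations, u + 2 memory references.
    """
    return Fraction(2 * unroll, unroll + 2)


def fraction_of_peak(intensity, data_paths):
    """Rule-of-thumb bound min{intensity * data paths, 1}, peak taken as 1."""
    return min(intensity * data_paths, 1)


if __name__ == "__main__":
    for u in (1, 2, 4, 8):
        ci = unrolled_intensity(u)
        # One data path, as stated for the VP2xxx.
        print(u, ci, fraction_of_peak(ci, 1))
```

Under these assumptions the intensity rises from 2/3 without unrolling towards 2 for deep unrolling, because the load and store of y are shared among more arithmetic operations.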
You can exploit this on your own machine fairly easily by using the
routines from level 2 BLAS and level 3 BLAS. To the best of my knowledge,
Kaiserslautern uses the routines that were optimized at the University of
Karlsruhe as part of the ODIN project.
Hope this helps,
Volker Kurz
--
Dr. Volker Kurz *** J. W. Goethe-Universitaet
kurz@math.uni-frankfurt.de *** Fachbereich Mathematik