|optimizing for caches firstname.lastname@example.org (Richard Cownie) (1992-11-17)|
From: Richard Cownie <firstname.lastname@example.org>
Date: Tue, 17 Nov 1992 23:30:08 GMT
From experience tuning vector routines for i860s and SPARCs, I've come
to the conclusion that understanding and exploiting the memory hierarchy
is essential to obtain good performance on these kinds of problems. But I
have yet to see a compiler which tackles this aspect of optimization.
Does anyone have references to research in this area? If so, please mail
me and I'll summarize.
With the way hardware is developing, this will be a big issue very soon.
Here are some approximate figures to illustrate the trend:
Year  Machine       MIPS rate  Cache miss (DRAM access)
1988  Sun-4/110     7          200ns ?
1991  SS-2 Cypress  25         120-150ns ?
1993  SS-10 Viking  50         100-120ns ?
1994  ?             150        80-100ns ?
So the relative cost of a cache miss has already risen from about 1.4
instructions to more than 5 instructions, and the Viking clock speed is
still only 40MHz; the technology exists now to build processors running at
150MHz (e.g. Alpha), which will take the cost of a cache miss over 20
instructions.
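The "cost in instructions" figures can be reproduced from the table with a little arithmetic. The sketch below is only an illustration: it assumes the MIPS rate is the sustained issue rate, so one instruction takes 1000/MIPS nanoseconds, and the function name `miss_cost_in_instructions` is made up for this example.

```python
def miss_cost_in_instructions(mips, miss_ns):
    """Instructions that could have issued during one cache miss.

    Assumes `mips` is the sustained issue rate, so one instruction
    takes 1000 / mips nanoseconds.
    """
    return mips * miss_ns / 1000.0


# (year, machine, MIPS rate, (low, high) miss latency in ns), from the table.
TABLE = [
    (1988, "Sun-4/110", 7, (200, 200)),
    (1991, "SS-2 Cypress", 25, (120, 150)),
    (1993, "SS-10 Viking", 50, (100, 120)),
    (1994, "?", 150, (80, 100)),
]

if __name__ == "__main__":
    for year, machine, mips, (low_ns, high_ns) in TABLE:
        low = miss_cost_in_instructions(mips, low_ns)
        high = miss_cost_in_instructions(mips, high_ns)
        print(f"{year}  {machine:<13} {low:5.1f} to {high:5.1f} instructions per miss")
```

With these inputs the 1988 row comes out at 1.4 instructions and the 1993 row at 5 to 6. The 1994 row comes out at 12 to 15 from the table's figures alone, so an estimate above 20 would need a higher issue rate or a longer miss latency than the table lists.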
It seems really important to get this right - if you get your instruction
scheduling wrong, you might still see 25% of optimum performance, but if
you don't exploit the memory hierarchy, you might see only 5% of optimum
performance.
Thanks in advance for all responses,
Richard Cownie (a.k.a. Tich), Meiko Scientific Corp