Re: Q: P6 branch prediction

krste@ICSI.Berkeley.EDU (Krste Asanovic)
29 Apr 1996 23:32:05 -0400

          From comp.compilers


From: krste@ICSI.Berkeley.EDU (Krste Asanovic)
Newsgroups: comp.arch,comp.compilers
Date: 29 Apr 1996 23:32:05 -0400
Organization: International Computer Science Institute, Berkeley, CA, U.S.A.
References: <3179B05D.2781@cs.princeton.edu> <3183DA6A.1DB8@hda.hydro.com>
Keywords: architecture

Terje Mathisen <Terje.Mathisen@hda.hydro.com> writes:
|> Code cache locality is a good reason for a compiler to handle a return
|> statement at once, instead of generating a common tail end for all the
|> possible exit points in a function. (Having a trace-feedback driven
|> compiler would also help, of course!)


As long as the return code is short. If the function epilogue has to
restore many registers, then repeating this code at each exit point
increases the instruction cache footprint of the routine and hence
might hurt performance.


A related often-missed optimization is delaying the build of a stack
frame until it is certain it is needed. I find I often write routines
that would be leaf routines except for a single conditional call to an
error handling routine, oftentimes due to an "assert". If the stack
frame build could be moved inside the condition, the usual case could
run faster.


E.g.,


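        #include <stdio.h>
        #include <stdlib.h>

        extern int max_x, y;    /* assumed to be defined elsewhere */

        int
        foo(int x)
        {
                if (x >= max_x)
                {
                        /* Rare error path: the only calls in the routine. */
                        fprintf(stderr, "X too big.\n");
                        exit(EXIT_FAILURE);
                }

                return x + y * 87 + 3;
        }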


When all function calls are inside a single conditional, it would seem
to make sense to delay building a stack frame until inside the
conditional. Though this probably messes up debugging information, it
would be nice to have as part of a higher optimization level.
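
Until a compiler does this itself, one source-level approximation is to
push the error path into a separate routine, so the common path contains
a single argument-free call and nothing else. The sketch below is only
that: the helper name foo_fail and the initial values of max_x and y are
assumptions made up for illustration, and whether the frame build is
actually deferred still depends on the compiler and target.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the globals used in the example above. */
static int max_x = 100;
static int y = 2;

/* Cold path kept out of line; the name foo_fail is an assumption.
   It never returns, since exit() terminates the program. */
static void
foo_fail(void)
{
        fprintf(stderr, "X too big.\n");
        exit(EXIT_FAILURE);
}

int
foo(int x)
{
        if (x >= max_x)
                foo_fail();     /* the only call left in foo */

        /* Common case: pure arithmetic, no calls. */
        return x + y * 87 + 3;
}
```

This does not turn foo into a true leaf routine, but it leaves the
compiler very little that needs saving around the one remaining call.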


Can any current compilers do this optimization?


--
Krste Asanovic phone: +1 (510) 642-4274 x143
International Computer Science Institute fax: +1 (510) 643-7684
1947 Center Street, Suite 600 email: krste@icsi.berkeley.edu
Berkeley, CA 94704-1198, USA http://www.icsi.berkeley.edu/~krste
--

