ABI & alignement: IA32

Laurent Guerby <guerby@acm.org>
30 Nov 2000 12:06:35 -0500

          From comp.compilers

From: Laurent Guerby <guerby@acm.org>
Newsgroups: comp.compilers
Date: 30 Nov 2000 12:06:35 -0500
Organization: Club-Internet (France)
Keywords: architecture, performance
Posted-Date: 30 Nov 2000 12:06:35 EST

On modern IA32 implementations, it is very important that doubles
(8-byte floats) are 8-byte aligned for performance reasons, as the
following C code shows:


$ cat dbl.c
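#include <stdio.h>
#include <stdlib.h>   /* malloc, exit */
#include <stdint.h>   /* uintptr_t */
#define N 10000
int main (int argc, char **argv) {
    double *x;
    int i, j;

    /* One spare element so the array still fits after a 4-byte shift. */
    x=(double*)malloc((N+1)*sizeof(double));
    if(x==NULL) exit(1);
    /* Any single command-line argument misaligns the array by 4 bytes. */
    if(argc==2) x=(double*)((char*)x+4);

    /* Print the address modulo 8: 0 when aligned, 4 when misaligned. */
    printf("%d\n", (int)((uintptr_t)x%8));
    for(i=0;i<N;i++) x[i]=(double)i;
    for(i=0;i<N;i++) for (j=0;j<N;j++) x[i]=0.5*(x[j]+x[i]);

    printf("%f\n", x[N-1]);
    exit(0);
}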
$ gcc -O2 dbl.c; time ./a.out; time ./a.out unaligned
0
9998.574083
3.37user 0.00system 0:03.79elapsed 88%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (97major+29minor)pagefaults 0swaps
4
9998.574083
5.67user 0.00system 0:06.38elapsed 88%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (97major+29minor)pagefaults 0swaps
$


Run time is 3.37s on a P2 350MHz when the array is 8-byte aligned, and
5.67s when the array is not 8-byte aligned.


However, the IA32 ABI says that a double on the stack should be
4-byte aligned. My question is: does such a requirement imply
that a compiler that pads the stack (or whatever) to get 8-byte
alignment is not ABI compliant (8-byte aligned is obviously also
4-byte aligned ;-)?
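
To make the padding idea concrete, here is a minimal sketch of one way
to get 8-byte-aligned storage for doubles: over-allocate, then round the
address up. The helper names (is_aligned, align_up,
alloc_doubles_aligned8) are hypothetical and not taken from any compiler
or ABI document. A compiler padding its own stack frame would presumably
apply the same rounding to a stack address rather than to a malloc
result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: nonzero if p is a multiple of align bytes. */
static int is_aligned(const void *p, size_t align) {
    return ((uintptr_t)p % align) == 0;
}

/* Hypothetical helper: round p up to the next multiple of align,
   which must be a power of two. Skips at most align-1 bytes. */
static void *align_up(void *p, size_t align) {
    uintptr_t a = (uintptr_t)p;
    assert(align != 0 && (align & (align - 1)) == 0);
    return (void *)((a + (align - 1)) & ~(uintptr_t)(align - 1));
}

/* Hypothetical helper: allocate n doubles starting on an 8-byte boundary.
   *raw receives the pointer that must later be passed to free(). */
static double *alloc_doubles_aligned8(size_t n, void **raw) {
    *raw = malloc(n * sizeof(double) + 7);   /* 7 bytes of slack for padding */
    return *raw ? (double *)align_up(*raw, 8) : NULL;
}
```

Since the result is a multiple of 8, it is also a multiple of 4, so the
4-byte requirement still holds; the cost is up to 7 wasted bytes. The
caller frees the pointer stored in *raw, not the returned double pointer.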


The GCC manual says in its i386 section:


<<
`-malign-double'
`-mno-align-double'
          Control whether GNU CC aligns `double', `long double', and `long
          long' variables on a two word boundary or a one word boundary.
          Aligning `double' variables on a two word boundary will produce
          code that runs somewhat faster on a `Pentium' at the expense of
          more memory.


          *Warning:* if you use the `-malign-double' switch, structures
          containing the above types will be aligned differently than the
          published application binary interface specifications for the 386.
>>
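
The warning concerns structure layout. As a small sketch, assuming a
4-byte int, the following reports where a double lands inside a
structure. The structure sample and the helper names are hypothetical,
chosen only for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical example structure: an int followed by a double. */
struct sample {
    int    i;
    double d;
};

/* Offset of the double member; depends on the ABI and compiler flags. */
static size_t double_offset(void) {
    return offsetof(struct sample, d);
}

/* Print the layout so builds with different flags can be compared. */
static void print_layout(void) {
    printf("offsetof(d) = %zu, sizeof(struct sample) = %zu\n",
           double_offset(), sizeof(struct sample));
}
```

Under the published i386 ABI the double member only needs 4-byte
alignment, so I would expect an offset of 4 and a size of 12 there,
versus 8 and 16 with -malign-double. Objects compiled with and without
the switch would then disagree about the layout. Calling print_layout()
from main under each setting should show the difference.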


I'm curious what people think about this ABI-conformance issue,
and in particular what the Intel compiler does by default in
optimizing mode when it encounters a stack array of doubles.


Thanks for any information,
--
Laurent Guerby <guerby@acm.org>
[I looked it up, I was mistaken when I said that you only need 4 byte
alignment -- the Pentium manuals more or less say that each type should
be aligned on a natural 2^n boundary. -John]




