Re: compilers using MMX instructions in the generated code

andi@complang.tuwien.ac.at (Andreas Krall)
15 Jan 2000 14:18:59 -0500

          From comp.compilers


From: andi@complang.tuwien.ac.at (Andreas Krall)
Newsgroups: comp.compilers
Date: 15 Jan 2000 14:18:59 -0500
Organization: Vienna University of Technology, Austria
References: 00-01-017 00-01-031
Keywords: architecture, optimize

lindahl@pbm.com (Greg Lindahl) writes:
> You also didn't mention the alignment restrictions. Isn't it the case
> that the inputs need to be 128-bit aligned? So if you don't know the
> alignment at compile-time, or the alignment happens to be unfortunate,
> you're out of luck. Consider:
>
> short int a1[50], a2[50], b[50], c[50];
>
> void foo(void)
> {
> int i;
> for (i = 0; i < 50; i++) a1[i] = b[i] + c[i];
> for (i = 0; i < 49; i++) a2[i] = b[i+1] + c[i];
> }
>
> In this example, I don't think you can vectorize both loops.


It is possible to vectorize both loops. Our prototype compiler for
the SPARC VIS can handle this case (with a little support from the
hardware). The SPARC supports unaligned loads, so that only three
instructions are necessary for an unaligned load (2 loads and a
merge). On processors without such support, similar code can be
emitted using shifts and a logical or. The prolog and epilog of the
loop need special handling. Inside the loop only one load is
necessary, because the second value from the previous iteration can
be reused. So loop peeling is necessary first; afterwards
vectorization can be applied.
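
The scheme described above can be sketched in portable C for the second
loop of the quoted example. This is only an illustration, not the output
of the prototype compiler and not VIS code: a 64-bit integer stands in
for a vector register holding four 16-bit elements, and the helper names
(load4, store4, add16x4, second_loop_vectorized, self_check) are
hypothetical. It assumes the usual wraparound when converting to a
16-bit signed value.

```c
#include <assert.h>
#include <stdint.h>

#define N 50

int16_t a2[N], b[N], c[N];

/* Stand-in for an aligned vector load: pack four 16-bit elements into
   one 64-bit word, element k in bits 16k..16k+15. */
static uint64_t load4(const int16_t *p)
{
    return (uint64_t)(uint16_t)p[0]
         | (uint64_t)(uint16_t)p[1] << 16
         | (uint64_t)(uint16_t)p[2] << 32
         | (uint64_t)(uint16_t)p[3] << 48;
}

/* Stand-in for an aligned vector store. */
static void store4(int16_t *p, uint64_t w)
{
    for (int k = 0; k < 4; k++)
        p[k] = (int16_t)(uint16_t)(w >> (16 * k));
}

/* Four independent 16-bit additions inside one 64-bit word: add the low
   15 bits of each lane, then fix up the top bit of each lane so that no
   carry crosses a lane boundary. */
static uint64_t add16x4(uint64_t x, uint64_t y)
{
    const uint64_t H = 0x8000800080008000ULL;
    return ((x & ~H) + (y & ~H)) ^ ((x ^ y) & H);
}

/* a2[i] = b[i+1] + c[i] for i = 0..N-2, four elements per iteration. */
void second_loop_vectorized(void)
{
    int i;
    uint64_t prev = load4(&b[0]);            /* prolog: first aligned word of b */

    for (i = 0; i + 8 <= N; i += 4) {
        uint64_t next = load4(&b[i + 4]);    /* the only load of b per iteration */
        /* Merge by shift and or: b[i+1..i+3] from prev, b[i+4] from next. */
        uint64_t shifted = (prev >> 16) | (next << 48);
        store4(&a2[i], add16x4(shifted, load4(&c[i])));
        prev = next;                         /* reused in the next iteration */
    }
    for (; i < N - 1; i++)                   /* epilog: leftover elements */
        a2[i] = (int16_t)(b[i + 1] + c[i]);
}

/* Compare against the scalar loop from the quoted example. */
void self_check(void)
{
    for (int i = 0; i < N; i++) {
        b[i] = (int16_t)(300 * i - 7000);
        c[i] = (int16_t)(1000 - 37 * i);
    }
    second_loop_vectorized();
    for (int i = 0; i < N - 1; i++)
        assert(a2[i] == (int16_t)(b[i + 1] + c[i]));
}
```

With N = 50 the main loop covers a2[0..43] and the epilog handles
a2[44..48], so no word of b is read past the end of the array.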
--
andi@complang.tuwien.ac.at Andreas Krall
http://www.complang.tuwien.ac.at/andi/ Inst. f. Computersprachen, TU Wien
tel: (+431) 58801/18511 Argentinierstr. 8/4/1851
fax: (+431) 58801/18598 A-1040 Wien AUSTRIA EUROPE

