Re: GPU-aware compiling?

"Michael Tiomkin" <tmk@netvision.net.il>
22 May 2005 00:57:08 -0400

          From comp.compilers

From: "Michael Tiomkin" <tmk@netvision.net.il>
Newsgroups: comp.compilers
Date: 22 May 2005 00:57:08 -0400
Organization: http://groups.google.com
References: 05-05-184
Keywords: architecture, performance
Posted-Date: 22 May 2005 00:57:08 EDT

Tomasz Chmielewski wrote:
> Recently I've been reading about General-Purpose Computation Using
> Graphics Hardware - http://www.gpgpu.org - and it seems that GPUs can
> bring quite a good performance when compared to the CPUs.
>
> In other words, a graphics chip on the graphics card can make really
> heavy computations, and it's easier and cheaper to buy a couple of
> top-performance graphics cards than to buy a multi-CPU machine (which
> are quite expensive).


    Well, most modern CPUs (and GPUs) are multi-unit processors, i.e.
they resemble a multi-CPU machine, and every modern compiler takes this
into account. Second, in terms of performance per cost, it is better to
buy many low-cost graphics cards or CPUs than a couple of
top-performance ones. The only problems are the availability and price
of very fast buses and networks, and the lack of good, efficient
parallel programming languages.


    On most current PCs, the fastest bus is AGP or PCI Express, but
it usually allows only one graphics card to be attached. This means
that, at this time, the only cheap solution is to use a couple of PCI
graphics cards plus a single GPU connected to AGP or PCI Express.
Motherboards with multiple AGP/PCI Express slots are much more
expensive.


    The second problem is that GPUs are heavily oriented towards
low-precision floating-point computation, which is what 3D needs.
This can help with solving differential equations, weather
prediction, etc., but it will be much more difficult to run
discrete algorithms on these processors.
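
    A small sketch of why limited precision hurts discrete work. It
assumes IEEE single precision (24-bit significand), which is my
assumption and is probably generous for current GPUs. The helper names
to_f32 and f32_add are made up for the illustration; they emulate
single precision on the CPU by rounding through the struct module.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def f32_add(a, b):
    """Add two numbers as a single-precision unit would: round inputs and result."""
    return to_f32(to_f32(a) + to_f32(b))


LIMIT = 2 ** 24  # beyond this, not every integer has a single-precision encoding

# Below the limit, counting in single precision is still exact.
assert f32_add(LIMIT - 2, 1) == LIMIT - 1

# At the limit, adding 1 is lost to rounding, so a counter would get stuck.
stuck = f32_add(LIMIT, 1)
print(stuck == LIMIT)
```

    A counter, index or hash kept in such a format silently stops being
exact once it grows past the significand, which is harmless for shading
pixels but not for integer algorithms.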


    The third problem is that most graphics card manufacturers don't
allow you to bypass their drivers and download code that can run on a
GPU. Allowing it would present a security problem for the system, and
could also compromise the graphics performance of the cards.


> Do you think - theoretically - that a compiler could help compiling
> software, which would in turn use the power of the GPU to make some
> of the computations?
>
> Like now we have compiler options like "-mmmx -msse -msse2 -msse3
> -m3dnow" - would it be possible to optimize the code of the binary to
> use the GPU with "-with-nvidia-gpu" or "-with-ati-gpu"?
>
> I would like to hear some theoretical discussion about that.


    This is called "cross-compilation", and it is possible with most
compilers. You'll also need code that downloads the program into the
card, lets your CPU activate it, and gets the results back from the
card. This is doable, and I think we had a discussion on this issue a
couple of years ago. The OEM manufacturers of Nvidia or ATI cards can
definitely include this in their drivers, together with some "sandbox"
system to prevent malfunctioning of the card.


    The only OS support you need is the ability to allocate some video
buffers to your process, and to read and write them without
interfering too much with the card.
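
    The host-side pieces described above (download the code, fill a
buffer, activate the code from the CPU, read the results back) could be
sketched roughly as follows. Everything here is hypothetical: FakeCard
and its methods are stand-ins I made up to show the control flow, not
any vendor's driver interface, and the "card" is simulated on the CPU.

```python
class FakeCard:
    """Simulated graphics card; a real one would sit behind the vendor driver."""

    def __init__(self):
        self.program = None
        self.buffers = {}

    def download(self, program):
        # Step 1: ship the cross-compiled code to the card.
        self.program = program

    def write_buffer(self, name, data):
        # Copy input data into a video buffer allocated to our process.
        self.buffers[name] = list(data)

    def run(self, src, dst):
        # Step 2: the CPU activates the code; it is applied to every element.
        self.buffers[dst] = [self.program(x) for x in self.buffers[src]]

    def read_buffer(self, name):
        # Step 3: copy the results back to the host.
        return list(self.buffers[name])


def offload(card, kernel, data):
    """Run kernel over data on the card and return the results."""
    card.download(kernel)
    card.write_buffer("in", data)
    card.run("in", "out")
    return card.read_buffer("out")


result = offload(FakeCard(), lambda x: x * x, [1.0, 2.0, 3.0])
print(result)
```

    A compiler option of the kind you describe would have to generate
both halves: the code that runs on the card and this glue around it.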


    Unfortunately, it's worthwhile to run only sufficiently large parts
of the code on a GPU, because of the cost of communicating with the
board. MMX/SSE allow much faster communication, and the interface is
included as part of the instruction set.
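
    A back-of-the-envelope model of that trade-off. The cost model (a
fixed setup overhead plus per-item transfer and compute times) and all
the numbers are illustrative assumptions of mine, not measurements of
any real card or bus. Times are in nanoseconds and kept as integers so
the arithmetic is exact.

```python
def cpu_time(n, t_cpu):
    """Time to process n items on the CPU."""
    return n * t_cpu


def gpu_time(n, t_gpu, t_xfer, overhead):
    """Time to process n items on the card, including bus traffic and setup."""
    return overhead + n * (t_xfer + t_gpu)


def break_even(t_cpu, t_gpu, t_xfer, overhead):
    """Smallest n for which offloading is strictly faster, or None if never."""
    gain = t_cpu - (t_gpu + t_xfer)  # time saved per item by offloading
    if gain <= 0:
        return None  # transfer eats the whole speedup
    return overhead // gain + 1


# Assumed costs: 10 ns per item on the CPU, 1 ns on the card,
# 4 ns per item over the bus, 100 microseconds to set up a run.
n_min = break_even(t_cpu=10, t_gpu=1, t_xfer=4, overhead=100_000)
print(n_min)
```

    With these made-up figures the card only wins above roughly twenty
thousand items per call; if the per-item transfer cost reaches the CPU
cost, it never wins at all.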


    Using a GPU is similar to running a parallel algorithm on an MP
machine with some of the memory shared between the processors; the
reason is that video buses are built for memory transfer rather than
message passing. MPs with shared memory are a huge area with hundreds
of research papers. In your case you'll have only two heavily
connected processors.


    Michael

