parallelizing loops on a DMA (Edward Walker)
Fri, 26 Aug 1994 06:00:10 GMT

          From comp.compilers


Newsgroups: comp.compilers
From: (Edward Walker)
Keywords: experiment, parallel, question
Organization: National University of Singapore
Date: Fri, 26 Aug 1994 06:00:10 GMT

Further to my recent post requesting time on a parallel machine to
conduct some experiments: I would now like to extend that request to
time on a *distributed memory machine* as well.

The purpose of my experiments is to try to derive an optimal
parallel form for the hypothetical loop (first introduced in [1])

      FOR i = 1, n
            FOR j = 1, m
                  FOR k = 1, p
                            A(i+j,3*i+j+3) = ...
                            ... = A(i+j+1,i+2*j+4) ...

using the dataflow information I generate for the definitions and
references of the elements of the array A(). At the moment, I can
generate the pure dataflow (i-j axis) version of the above loop by
deriving the distance vectors of the data dependence above. What I
need to find out is whether the additional overhead of synchronizing
on all the data dependencies will overwhelm any benefit gained.
So essentially, I would like to do multiple runs, with varying (n,m,p),
on (possibly) different dataflow versions of the above kernel.
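For concreteness, the flow dependences between the write A(i+j, 3*i+j+3)
and the read A(i+j+1, i+2*j+4) can be enumerated by brute force over a
small iteration space, which makes the distance vectors visible directly.
This is only an illustrative sketch (the bounds n and m below are
arbitrary placeholders, and a real parallelizer would solve the
dependence equations symbolically rather than enumerate):

```python
# Enumerate flow dependences in the kernel
#   A(i+j, 3*i+j+3) = ...        (write)
#   ... = A(i+j+1, i+2*j+4)      (read)
# over a small (hypothetical) iteration space, collecting the
# distance vectors (i_read - i_write, j_read - j_write).

n, m = 8, 8  # placeholder loop bounds

# Map each array element to the iterations that write it.
writes = {}
for i in range(1, n + 1):
    for j in range(1, m + 1):
        writes.setdefault((i + j, 3 * i + j + 3), []).append((i, j))

# For each read, look up any iteration that wrote the same element.
distances = set()
for i in range(1, n + 1):
    for j in range(1, m + 1):
        for (iw, jw) in writes.get((i + j + 1, i + 2 * j + 4), []):
            distances.add((i - iw, j - jw))

print(sorted(distances))
```

Note that the resulting distance vectors vary with (i, j), i.e. the
dependence is non-uniform; this is exactly the situation that the
uniformization technique of [1] is meant to handle.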

The hope (fantasy?) of course is to one day allow a parallelizing
compiler to generate the dataflow form automatically.

Many thanks again.

- edward

[1] T. H. Tzen and Lionel Ni, "Dependence Uniformization: A Loop
Parallelization Technique", IEEE Trans. Parallel and Distributed
Systems, vol. 4, no. 5, 1993, pp. 547--558.
Edward Walker
National Supercomputing Research Centre
81, Science Park Drive
#04-03, The Chadwick
Singapore Science Park
Singapore 0511

tel: (65)-7909-226

