26 Jan 2006 13:32:01 -0500

From: | George Neuner <gneuner2@comcast.net> |

Newsgroups: | comp.compilers |

Date: | 26 Jan 2006 13:32:01 -0500 |

Organization: | Compilers Central |

References: | 06-01-037 06-01-069 |

Keywords: | analysis |

Posted-Date: | 26 Jan 2006 13:32:01 EST |

On 20 Jan 2006 16:12:15 -0500, "Oliver Wong" <owong@castortech.com>

wrote:

*> ... after doing some more reading on DFA, ...*

Be careful with your terminology ... "DFA" normally stands for

"deterministic finite automaton". Data flow analysis is not generally

abbreviated.

*>Some of the articles I've read seem to imply*

*>that I should already have a control flow graph before starting the*

*>analysis, and other articles seem to imply that this the CFG should be built*

*>up in parallel with the analysis. Can someone clear up what the proper*

*>ordering is, or where I can go to read more about this?*

It's easier if you construct the basic blocks and the control graph

first. Some analyses require chasing loops through the graph and

those will be complicated greatly by trying to build the graph on

demand as you perform them. Additionally some analyses are more

easily done by walking the graph backward. Everything will require

[at some point] examining the individual operations in the basic

blocks.

*>In my current implementation, there is no explicit CFG being built at all.*

I think the problem is that we don't really understand what you are

trying to do. You said in your original post:

"For example, in the following statement: x = y + 5 + z;

the value written to 'x' depends on both the value of 'y'

and the value of 'z'."

and in your last post:

"This is what I now do. The output is something like:

LeftAssignmentOccurrence: "x":21:5:Write

Occurences Participating in assignment value:

Occurrence: "y":21:14:Read

Occurrence: "z":21:19:Read "

It seems like the question you are really trying to answer is: "where

did the value of X come from". For that you are going to have to get

down and dirty with less abstract representations.

Consider the following:

if ( ... )

x = y + 1;

else

x = z - 2;

w = x * 3;

What does the value of 'w' depend on? 'x' of course ... but which

'x'? The one that depends on 'y' or the one that depends on 'z'? Is

that important to your analysis? What if that code were part of a

loop and 'y' or 'z' depended on the redefinition of 'w'?

Conditionals and loops (among other things) are why people are

suggesting you look into value numbering and SSA. After either

transformation you will have something like:

if ( ... )

x{1} = y{1} + 1;

else

x{2} = z{1} - 2;

=> X{3} = (x{1} | x{2}) <=

w{1} = x{3} * 3;

where each use of a symbol is subscripted. It's similar enough to the

original code that you can read and understand it but is much easier

to analyze. The additional line defining 'x{3}' is a pseudo operation

called a phi function which is used to merge conflicting definitions -

in this case it means 'x{3}' may take the value of either 'x{1}' or

'x{2}'. Phi functions are needed for analysis only and don't normally

produce any real code ... ie. there is still just one 'x' in the

program, but now we can track how it's values change and how they are

used.

In fact, given what you have said previously, a value numbered source

listing would directly answer the questions of what is read and

written provided the depiction of the symbol subscripts was clear to

the human reader.

If you can explain better what you are really trying to achieve then

we can probably help you more.

George

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.