Re: How to eliminate redundant constant move instructions

George Neuner <gneuner2@comcast.net>
Fri, 04 Nov 2011 17:26:29 -0400

          From comp.compilers

Related articles
[9 earlier articles]
Re: How to eliminate redundant constant move instructions gneuner2@comcast.net (George Neuner) (2011-11-02)
Re: How to eliminate redundant constant move instructions acolvin@efunct.com (mac) (2011-11-03)
Re: How to eliminate redundant constant move instructions kaz@kylheku.com (Kaz Kylheku) (2011-11-03)
Re: How to eliminate redundant constant move instructions gah@ugcs.caltech.edu (glen herrmannsfeldt) (2011-11-03)
Re: How to eliminate redundant constant move instructions gneuner2@comcast.net (George Neuner) (2011-11-04)
Re: How to eliminate redundant constant move instructions gah@ugcs.caltech.edu (glen herrmannsfeldt) (2011-11-04)
Re: How to eliminate redundant constant move instructions gneuner2@comcast.net (George Neuner) (2011-11-04)
Re: How to eliminate redundant constant move instructions amker.cheng@gmail.com (amker) (2011-11-07)
Re: How to eliminate redundant constant move instructions chenwj@cs.NCTU.edu.tw (Wei-Jen Chen) (2011-11-10)
Re: How to eliminate redundant constant move instructions gneuner2@comcast.net (George Neuner) (2011-11-10)
| List of all articles for this month |

From: George Neuner <gneuner2@comcast.net>
Newsgroups: comp.compilers
Date: Fri, 04 Nov 2011 17:26:29 -0400
Organization: A noiseless patient Spider
References: 11-10-019 11-11-004 11-11-009
Keywords: optimize
Posted-Date: 06 Nov 2011 23:49:22 EST

On Tue, 1 Nov 2011 19:21:54 -0700 (PDT), amker <can.finner@gmail.com>
wrote:


>On Nov 2, 2:32 am, George Neuner <gneun...@comcast.net> wrote:
>> On Mon, 31 Oct 2011 17:53:46 +0800, "Amker.Cheng" <amker.ch...@gmail.com> wrote:
>> >I found following intermediate codes are generated in gcc
>>
>> >rx <- 0
>> >...
>> >use rx
>> >...
>> >ry <- 0
>> >use ry
>> >...
>>
>>
>... the test case showed at:
>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025
>which is actually a reported gcc bug.


Ok. That bug was reported for ARM7 thumb code. I'm not familiar with
ARM assembler (had to look up the instructions), but I do see that R0
already holds the proper value when R3 is loaded - which I presume is
what you are talking about.


But I don't necessarily agree that it is a bug. It's definitely not
optimal code, but R0 and R3 represent different temporary variables,
so it's hard to see how any of value numbering, constant propagation
or register allocation is at fault.




>> GCC doesn't really perform a separate constant propagation
>> optimization ... instead it generically tracks use of values to (try
>> to) minimize redundant loads. There is a separate optimization,
>> -fcprop-register, which is a peephole pass that eliminates redundant
>> register moves (introduced by other optimizations), but that is
>> performed after register allocation.
>
>Yes, I have noticed this pass. Seems it can solve the problem if I
>can:
>1, extend the pass in value numbering way, at least for const values.
>2, extend the pass in global data analysis way.


Yes. But value numbering doesn't know or care what specific value is
in the register - it cares only whether the value in the register
might have been changed between read uses.


Variable value tracking is the related technique used for constant
propagation. However what you are after is variable equivalence
testing - I don't think GCC even tries to do this.




>> You might be asking "if the value already is in a register, why not
>> just use it rather than load a second register?" The answer to that
>> likely is a scheduling issue which depends on the use of the first
>> register. You have to remember that many CPUs can execute multiple
>> instructions in parallel, and those parallel instruction streams may
>> be executed out of order with respect to a program listing.
>>
>> On most CPUs loading an immediate constant is as cheap as a register
>> move. Also, loading a constant ties up only the target register
>> whereas a move ties up both target and source registers.
>
>Yes, This is the point I missed. So even if the codes are optimized into:
>
>rx <- 0
>...
>use rx
>...
>use rx
>
>It might be worse than original codes considering out-of-order and
>parallel execution. In this way, how should I know for sure which case
>is better?


You can't know without trying it. If the target CPU is known, the
compiler could simulate execution of the code sequence to try out
logically equivalent approaches. But most CPU manufacturers don't
even publish instruction times any more - never mind providing enough
information to accurately simulate the chip (emulation is not
simulation).


George


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.