Small values in large registers (char,short,...)

"Jakob Engblom" <jakob.engblom@iar.se>
2 Dec 1997 12:08:31 -0500

          From comp.compilers


From: "Jakob Engblom" <jakob.engblom@iar.se>
Newsgroups: comp.compilers
Date: 2 Dec 1997 12:08:31 -0500
Organization: IAR Systems
Keywords: code, architecture, question
Comments: Authenticated sender is <jakob@mailhost.iar.se>

Hi!


I am currently looking into implementing a C compiler for a "pure"
32-bit architecture. One problem I have found is that when chars and
shorts (8- and 16-bit values) are kept in 32-bit registers, you have
to watch out for overflow in certain situations.


In most cases, you can simply compute as usual without worrying: an
add is still an add, and any garbage in the high bits can safely be
ignored, as long as it is masked off before the value is stored back
to memory.
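
A minimal sketch of why addition tolerates garbage in the high bits,
assuming unsigned 8-bit chars and using uint32_t variables as a
stand-in for 32-bit registers. The names add_in_register and
store_as_char are made up for illustration and do not come from any
particular compiler:

```c
#include <stdint.h>

/* A uint32_t models a 32-bit register that holds an 8-bit value,
   possibly with garbage in bits 8..31. Names are hypothetical. */
static uint32_t add_in_register(uint32_t a, uint32_t b)
{
    return a + b;               /* plain 32-bit add, no masking */
}

/* The mask is applied only when the value is stored as a char. */
static uint8_t store_as_char(uint32_t reg)
{
    return (uint8_t)(reg & 0xFFu);
}
```

The low 8 bits of a sum depend only on the low 8 bits of the
operands, so store_as_char(add_in_register(a, b)) gives the same
byte whether or not a and b were masked first.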


So far, I have identified the following problematic areas:


* Arithmetic operations which do not commute with reduction modulo 256
    (char arithmetic can be considered to be ordinary arithmetic modulo
      256). Typical examples are right shift and division.


    For example: (x DIV y) % 256 != (x % 256) DIV (y % 256)


    Try x=256, y=16: the left-hand side gives 16, the right-hand side 0.


    Note that 256 is not a valid char value, and this is the
    problem:


    WHEN DO YOU HAVE TO MASK AWAY EXCESS BITS (or sign-extend)?
    A few more examples of problematic situations follow below:


* Tests.
    In comparisons, the wraparound effect in chars should work as
    expected.


    Otherwise, with
    r = 0x80, s = 0x81  =>  r+s = 0x101
    u = 0x00, t = 0x01  =>  u+t = 0x01


    r+s == u+t is NOT true, although it should be.


* Array indexing.
    a[r] ... if r is held in a register wider than a char, indexing
    by a CHAR value can lead to index values of 500 or so, which is
    not very good.
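
The three problem areas above can be sketched as follows. This
models the register-level behaviour for results that are supposed to
be char-typed, assuming unsigned 8-bit chars and uint32_t as a
stand-in for a 32-bit register. All function names are hypothetical:

```c
#include <stdint.h>

/* Reduce a "register" to a valid unsigned char value. */
static uint32_t mask8(uint32_t reg) { return reg & 0xFFu; }

/* Division: masking only the result is not enough. */
static uint8_t div_unmasked(uint32_t x, uint32_t y)
{
    return (uint8_t)mask8(x / y);        /* stale high bits leak in */
}

static uint8_t div_masked(uint32_t x, uint32_t y)
{
    /* caller must ensure mask8(y) != 0 */
    return (uint8_t)(mask8(x) / mask8(y));
}

/* Right shift: high garbage is shifted down into the low byte. */
static uint8_t shr_unmasked(uint32_t x, unsigned n)
{
    return (uint8_t)mask8(x >> n);
}

static uint8_t shr_masked(uint32_t x, unsigned n)
{
    return (uint8_t)(mask8(x) >> n);
}

/* Tests: comparing whole registers sees the excess bits. */
static int equal_unmasked(uint32_t a, uint32_t b) { return a == b; }
static int equal_masked(uint32_t a, uint32_t b)
{
    return mask8(a) == mask8(b);
}

/* Array indexing: the index must lie in 0..255 before it is used. */
static uint32_t char_index(uint32_t r) { return mask8(r); }
```

With x=256 and y=16, div_unmasked gives 16 and div_masked gives 0.
With r+s = 0x101 and u+t = 0x01, only equal_masked reports equality.
An unmasked index of 250+250 = 500 becomes 244 after char_index.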


Note that this is not a problem on architectures like the 68k, where
there are dedicated byte, word and long operations. On the other
hand, on the SPARC, there is a problem.


Some experimentation with gcc and cc for SPARC showed that there
seem to be at least two different tactics:
- mask after EVERY operation, to keep the value within bounds.
- mask before dangerous operations.
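
A rough sketch of the two tactics, applied to the char computation
t = a + b; t = t + c; t = t / d. This only illustrates the idea and
is not a description of the code gcc or cc actually emit. It assumes
unsigned 8-bit chars, uint32_t as a model of a register, and the
hypothetical names mask8, eager and lazy:

```c
#include <stdint.h>

static uint32_t mask8(uint32_t reg) { return reg & 0xFFu; }

/* Tactic 1: mask after EVERY operation. Registers always hold
   valid char values, so d is assumed to be in range (and nonzero). */
static uint8_t eager(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t t = mask8(a + b);
    t = mask8(t + c);
    t = mask8(t / d);
    return (uint8_t)t;
}

/* Tactic 2: let the adds run unmasked and mask only the operands
   of the dangerous operation (here, the division). */
static uint8_t lazy(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
    uint32_t t = a + b;
    t = t + c;
    t = mask8(t) / mask8(d);    /* mask8(d) must be nonzero */
    return (uint8_t)t;          /* quotient of two bytes fits in a byte */
}
```

Both give the same char result. The first spends a mask on every
operation, while the second spends masks only where excess bits
could change the outcome.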




My main question: are there any good books, articles, or common
knowledge regarding this question? Most compiler literature I have
seen is either too old or deals with issues too high-level to care
about this "simple" problem.


Grateful for all help I can get!


/jakob




----------------------------------------------------
Jakob Engblom, System Developer & PhD Student
IAR Systems, Uppsala, Sweden
e-mail: jakob.engblom@iar.se
--

