5 Jul 1999 13:43:26 -0400

Related articles |
---|

Rounding with Div and Mod operators - a summary william.rayer@virgin.net (William Rayer) (1999-07-05) |

From: | "William Rayer" <william.rayer@virgin.net> |

Newsgroups: | comp.compilers |

Date: | 5 Jul 1999 13:43:26 -0400 |

Organization: | Virgin News Service |

Keywords: | arithmetic, summary |

Dear Newsgroup

In May I posted a question about the best way of rounding when using

the Div and Mod operators. I want to thank everyone who responded,

particularly Jonathan Barker who explained the expected mathematical

behaviour of these operators.

I'm submitting these notes to the newsgroup in case it helps anyone

else, and I'm happy for them to be added to the FAQ if the moderator

thinks this would be useful. These notes will also be added to the

"Design Decisions" appendix in the Reference Manual for the new

language, which explains why they go into a lot of detail.

In brief (for those too busy to read beyond this paragraph!) Div and

Mod should always round down as this is the correct mathematical

behaviour, however this incurs a performance penalty on the Intel

80x86 CPU. To reduce the cost assembler routines are provided. Using

these routines Div and Mod cost ~80 cycles instead of ~40.

Appendix 8.2 - Design Decisions - Div and Mod operators

-------------------------------------------------------

The Div and Mod operators are used for integer division and integer

remainder (modulus). Existing languages have several interpretations

of these operators so any new language must fully define its operators

for positive and negative operands.

Existing Systems

----------------

For an integer numerator (n) and integer divisor (d) with

d <> 0, quotient (q) and remainder (r) all implementations

agree on the following:

q = n Div d

r = n Mod d

n = q*d + r

|r| < |d|

and by rearrangement:

n Mod d = n - (n Div d) * d

These rules state Div and Mod produce a quotient and a

remainder, which can be put back into the formula q*d + r

to generate the original number. Also the absolute value of

the remainder is less than the absolute value of the divisor.

These rules do not fully define the behaviour of Div and

Mod as there are several rounding systems compatible with

the rules:

Div rounds down Div rounds to 0 n Mod d >= 0

<--------------> <--------------> <-------------->

n d n Div d n Mod d n Div d n Mod d n Div d n Mod d

----------------------------------------------------------------

6 3 2 0 2 0 2 0

5 3 1 2 1 2 1 2

4 3 1 1 1 1 1 1

3 3 1 0 1 0 1 0

2 3 0 2 0 2 0 2

1 3 0 1 0 1 0 1

0 3 0 0 0 0 0 0

-1 3 -1 2 0 -1 -1 2

-2 3 -1 1 0 -2 -1 1

-3 3 -1 0 -1 0 -1 0

-4 3 -2 2 -1 -1 -2 2

-5 3 -2 1 -1 -2 -2 1

-6 3 -2 0 -2 0 -2 0

6 -3 -2 0 -2 0 -2 0

5 -3 -2 -1 -1 2 -1 2

4 -3 -2 -2 -1 1 -1 1

3 -3 -1 0 -1 0 -1 0

2 -3 -1 -1 0 2 0 2

1 -3 -1 -2 0 1 0 1

0 -3 0 0 0 0 0 0

-1 -3 0 -1 0 -1 1 2

-2 -3 0 -2 0 -2 1 1

-3 -3 1 0 1 0 1 0

-4 -3 1 -1 1 -1 2 2

-5 -3 1 -2 1 -2 2 1

-6 -3 2 0 2 0 2 0

----------------------------------------------------------------

Div rounds down. In this system Div always rounds in a negative

direction, unless the result is exact. This is known as "floored

division". The sign of n Mod d equals the sign of d. If there is

a floor() or int() operator which rounds real numbers in a

negative direction, n Div d = Int(n/d) = floor(n/d). This

rounding system is used by Ada with its "divide out completely"

operator and Mod operator, and by Modula 3.

Div rounds to zero. In this system Div rounds towards zero,

unless the result is exact. This is known as "symmetric division".

The sign of n Mod d equals the sign of n. Also -n Div d = n Div -d

= -(n Div d). This rounding system is used by Ada with its integer

division operator "/" and Rem operator. C and Pascal also use

this rounding system on the Intel 80x86, although this behaviour

is not formally defined for these languages. The signed division

operator in the Intel 80x86 instruction set (idiv) follows this

system.

n Mod d >= 0. In this system the modulus is always >= zero.

Although equally valid, this system is not used in any

languages known to the author.

Mathematical definitions of Div and Mod

---------------------------------------

We have described several different rounding systems. Which of

these systems are in agreement with mathematical number theory?

Number theorists define integer division and remainder operators

for any n and d > 0 as follows:

quotient q = n Div d

remainder r = n Mod d

n = q*d + r

0 <= r <= d-1

Mathematically speaking the Mod operator defines which of d

classes an integer n falls into. Class [0] denotes integers in

the form n = qd, Class [1] denotes integers n = qd + 1, and

so on up to Class [d-1] for integers n = qd + (d-1). For

example -d, 0 and d are in Class [0], -d+1, 1, d+1 are in

Class [1], and -1, d-1 are in Class [d-1]. Since in this

definition the classes are numbered 0 to d-1, and since the

Mod operator denotes a class, Mod must return an integer

0 to d-1. Thus n Mod d, for any integer n, is always >= 0.

Therefore a rounding system conformant with number theory

should observe the following:

quotient q = n Div d

remainder r = n Mod d

n = q*d + r

If d > 0 then 0 <= r <= d-1

If d < 0 then 0 >= r >= d+1

The last rule (for d<0) is added for symmetry. Number theory

does not cover values of d < 0, so for these values an

implementation is free to return any remainder r consistent

with n = qd + r and |r| < |d|.

Also in number theory the identity -n Div d = n Div -d =

-(n Div d) does not apply because the integers do not form

a field under division. Since integers do not have

multiplicative inverses the division operator is not

fully defined for integers. Therefore the mathematically

correct rules of Div and Mod do not conform to this

identity.

Proposed Implementation

-----------------------

The new language follows the rounding system defined by

the mathematical rules, since this is the behaviour expected

by number theory and there is no compelling reason to

adopt an alternate system. The rules specify r >= 0

for any value of n and for d > 0, which is equivalent

to Mod returning values >= 0 for d > 0, and <= 0 for

d < 0. In this rule the sign of n Mod d = the sign of d

and Div always rounds down.

This differs from C and Pascal on the Intel 80x86 which

round to zero. Although most languages round to zero, and

the signed division opcode on most CPUs round to zero,

this behaviour does not conform to number theory and

need not be followed by a new language. The proposed

implementation therefore follows the mathematical

definitions outlined previously.

The cost of this decision is Div and Mod can no longer

be implemented with a single opcode in the 80x86 instruction

set, and an assembler routine is required instead:

;Implementation of Div and Mod rounding downwards.

;Inputs n=numerator, d=divisor

;Calculates q=quotient, r=remainder

;Uses eax, ecx and edx registers

;if d<0 n=n-1 and make d positive

mov eax,n

mov ecx,d

cmp ecx,0

jge @d_ge_zero

dec eax

neg ecx

@d_ge_zero:

;if (adjusted) n<0 make n positive

cmp eax,0

jge @adj_n_ge_zero

inc eax

neg eax

@adj_n_ge_zero:

;numerator (>=0) in edx:eax, divisor (>0) in ecx, quotient -> eax

mov edx,0

idiv ecx

;if n=0 div is done

mov edx,n

cmp edx,0

je @div_done

;if n,d have opposite signs make q negative

xor edx,d

cmp edx,0

jge @div_done

neg eax

dec eax

;save q

@div_done:

mov ecx,q

mov [ecx],eax

;Now calculate the modulus (quotient is in eax)

imul d

mov ecx,n

sub ecx,eax

mov eax,ecx

mov edx,r

mov [edx],eax

These routines may seem a heavy penalty to pay for a

mathematically correct definition. However after examining

the instruction timings idiv dwarfs all the other opcodes.

On the 486 or Pentium idiv is about 40 cycles, imul is

14, and mov, neg, inc, and sub are 1 cycle each. Thus

although the amount of code has increased very considerably,

the execution time is approximately 80 cycles instead of 40.

Compilable implementations using Borland Delphi inline

assembler for 32 bit Windows are at the end of this

document.

Other comments

--------------

While deciding the best implementation of Div and Mod,

some other interesting questions were addressed.

Under downward rounding the identity -n Div d = n Div -d =

-(n Div d) has no basis in number theory and does not apply,

and the identity -n/d = n/-d = -(n/d) holds only for reals

and not integers. This means a language using "/" for

integer division is more or less obliged to use symmetric

rounding, as this is the only system for which the identity

-n/d = n/-d = -(n/d) applies to integers and reals. (It

would be confusing if a language used downwards rounding

and "/" for integer division as the identity would hold

for reals only). This is probably the reason why C, which

overloads "/" for integers and reals, uses symmetric

rounding.

Should a language provide several pairs of operators,

allowing both downwards and symmetric rounding? Ada does

this, using "divide out completely" and Mod for downwards

rounding, and integer division "/" and Rem for symmetric

rounding. Although superficially attractive, this option

is rejected for the following reasons:

1. The principle of clarity in language design requires

tasks to be conceptually distinct and to be carried out

using clearly differentiated commands. It is hard to

think of a greater violation of this rule than having

two similar pairs of operators, especially considering

their behaviour is identical in the 1st and 3rd quadrants

and only differs in the 2nd and 4th.

2. There is no consensus among language designers which

rounding system is preferred. So what chance has the

average programmer got of making the correct choice?

3. If language designers side step difficult choices by

passing them onto the user, the result is a language that

is more complex and less clear than needed. A new language

should make a bold decision as to which is the correct

choice, and should move forwards through simplicity and

power, rather that trying to be all things to all persons.

Define MININT as the lowest possible signed integer, which

on the Intel architecture has the sign bit set and all other

bits clear. For 32 bit integers this is 0x80000000 or

-2,147,483,648. Are -1 Mod MININT and MININT Mod -1 defined?

-1 Mod MININT. Under downwards and symmetric rounding, -1

Mod (any negative number) should = -1. -1 Mod MAXINT

(0x7fffffff) is also interesting as it should equal MAXINT-1.

MININT Mod -1. Under downwards and symmetric rounding, anything

Mod -1 should equal zero.

If a language has an operator for rounding reals to integers

the rounding should be consistent with the rounding used

for Div and Mod. For example if a language adopted downwards

rounding and used Int to convert real to integer, Int should

also round down. Then Int(n/d) = n Div d. This means Mod

can be equally defined:

n Mod d = n - (n Div d)*d

n Mod d = n - Int(n/d)*d

Implementation

--------------

function IntMod (const n,d:longint):longint;

(*

We can modify eax,ecx and edx. All other registers must be preserved.

*)

assembler; pascal;

asm

(* if d<0 n=n-1 and make d positive *)

mov eax,n

mov ecx,d

cmp ecx,0

jge @d_ge_zero

dec eax

neg ecx

@d_ge_zero:

(* if (adjusted) n<0 make n positive *)

cmp eax,0

jge @adj_n_ge_zero

inc eax

neg eax

@adj_n_ge_zero:

(* numerator (>=0) in edx:eax, divisor (>0) in ecx, quotient -> eax *)

mov edx,0

idiv ecx

(* if n=0 div is done *)

mov edx,n

cmp edx,0

je @div_done

(* if n,d have opposite signs make q negative *)

xor edx,d

cmp edx,0

jge @div_done

neg eax

dec eax

(* Now calculate the modulus and return in eax *)

@div_done:

imul d

mov ecx,n

sub ecx,eax

mov eax,ecx

end;

function IntDiv (const n,d:longint):longint;

(*

Implementation of integer division using downwards rounding.

n,d,q,r - Numerator, Divisor, Quotient and Remainder as signed integers.

The routine works by making n,d positive, doing a signed division,

then adjusting the results. We can modify eax,ecx and edx and the

function return value is stored in eax. All other registers must

be preserved.

*)

assembler; pascal;

asm

(* if d<0 n=n-1 and make d positive *)

mov eax,n

mov ecx,d

cmp ecx,0

jge @d_ge_zero

dec eax

neg ecx

@d_ge_zero:

(* if (adjusted) n<0 make n positive *)

cmp eax,0

jge @adj_n_ge_zero

inc eax

neg eax

@adj_n_ge_zero:

(* numerator (>=0) in edx:eax, divisor (>0) in ecx, quotient -> eax *)

mov edx,0

idiv ecx

(* if n=0 div is done *)

mov edx,n

cmp edx,0

je @div_done

(* if n,d have opposite signs make q negative *)

xor edx,d

cmp edx,0

jge @div_done

neg eax

dec eax

(* q is returned in eax *)

@div_done:

end;

--end--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.