64 bit real I/O

agnew@rc.trw.com (R.A. Agnew)
Tue, 18 Jul 89 10:36:35 PDT

          From comp.compilers

Related articles
64 bit real I/O agnew@rc.trw.com (1989-07-18)
| List of all articles for this month |

From: agnew@rc.trw.com (R.A. Agnew)
Date: Tue, 18 Jul 89 10:36:35 PDT
Followup-To: comp.compilers,poster
Newsgroups: comp.compilers
Keywords: IEEE Floating point

Implementing double-precicion real IO for 64 bit IEEE numbers which can
display more than about 9 or 10 digits correctly seems to be challenging.
I am using the power decomposition algorithm which keeps a table of ten
raised to binary powers.

i.e. pow[m] = 1.0D2**m; m = 0..8

This table is often computed recursively by starting with pow[0] = 10.0D0
which is exactly 4024000000000000h.

pow[0] := 4024000000000000h;
FOR m := 0 TO 7 DO
pow[m] := pow[m+1]*pow[m+1];

To improve accuracy, I also construct a table for negative powers.

i.e. powinv[m] = 1.0D2**-m

This table starts off with 1.0D-1 which is only approximate.

powinv[0] := 3FB999999999999Ah;
FOR m := 0 TO 7 DO
powinv[m] := powinv[m+1]*powinv[m+1];

It can be seen that with 64 bit IEEE arithmetic, pow[0] through pow[3] can
be represented exactly, all others being approximate. Any errors in this table
will degrade the accuracy of double precision IO. Exponents of ten which are
not powers of two are formed by succesively multiplying the entries in the
tables. Since the highest exponent is about 308, this error can be as high
as 6 times the error in one floating multiply. Even so, one could expect to
print 13 or 14 digits correctly. Also, with only 32 bit integers, one needs
to use the tables entries for 3 and 4 to convert floating fractions to
integers. This interaction makes it difficult to derive the table using
iterative techniques. Although this problem can be circumvented by writing
higher precision floating point software for IO conversion routines, I wish
to implement it with only IEEE 64 bit arithmetic. Also using a FPU such as
an 80x87 with 80 bit floating arithmetic would help, however I do not have
this luxury.

The following are the power tables I generated recursively using 64
bit arithmetic and rounding:

pow[0] := 4024000000000000h;
pow[1] := 4059000000000000h;
pow[2] := 40C3880000000000h;
pow[3] := 4197D78400000000h;
pow[4] := 4341C37937E08000h;
pow[5] := 4693B8B5B5056E17h;
pow[6] := 4D384F03E94097BBh;
pow[7] := 5A827748F9310CE6h;
pow[8] := 75154FDD7F75E888h;

powinv[0] := 3FB999999999999Ah;
powinv[1] := 3F847AE147AE147Ah;
powinv[2] := 3F1A36E2EB1C432Ah;
powinv[3] := 3E45798EE230F511h;
powinv[4] := 3C9CD2B297DA4EF6h;
powinv[5] := 3949F623D5AC4AF3h;
powinv[6] := 32A50FFD44FAF6F0h;
powinv[7] := 255BBA08CF9DDDEFh;
powinv[8] := 0AC8062864CACDB1h;

As mentioned, numbers with composite exponents will have more error, however
one could hope to be able to write the table itself with 14 or 15 digits of
accuracy. Using the table above this produces:




Clearly, this can be greatly improved with better tables. Since I do not have
the hardware to compute them, perhaps someone out there has already done so
or can steer me to the right source. All comments and suggestions will be
greatly appreciated. Please post replies to comp.compilers or email to:


Thanks in advance. -- Bob Agnew
[I never had any trouble getting 13 or 14 digits from IEEE floating point
numbers, using simple algorithms that multiplied or divided by 10 until the
number was in [1,10), then extracting the integer part, subtracting and
multiplying until I had as many digits as needed, then rounding. Can anyone
see what this fellow's problem is? -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.