The hardest logarithm to compute
Suppose you want to compute the natural logarithms of every floating point number, correctly truncated to a floating point result. Here by floating point number we mean an IEEE standard 64-bit float, what C calls a double. Which logarithm is hardest to compute?
We'll get to the hardest logarithm shortly, but first a warm-up problem. Suppose you need to compute the first digit of a number z. You know that z equals 2 + I where I is either 10^-100 or -10^-100. If I is positive, you should return 2. If I is negative, you should return 1. Even though you know z to very high precision a priori, to return the first digit correctly you have to compute z to 100 decimal places.
In this post we're truncating floating point numbers, i.e. rounding down, but similar considerations apply when rounding to nearest. For example, the same difficulty comes up if you want to round 0.5 + I to the nearest integer.
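To make the warm-up concrete, here is a minimal Mathematica sketch. (The names z1 and z2 and the particular digit counts are mine, just for illustration.) At 50 significant digits the two candidate values of z are indistinguishable; it takes roughly 100 digits to tell them apart and return the correct first digit.

z1 = 2 + 10^-100;
z2 = 2 - 10^-100;
N[z1, 50]  (* displays as 2.0000...0 *)
N[z2, 50]  (* also displays as 2.0000...0, indistinguishable from z1 *)
N[z2, 101] (* enough precision to reveal 1.9999...9, so the first digit is 1 *)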
In general, it may not be possible to know how accurately you need to compute a value in order to round it correctly. William Kahan called this the table maker's dilemma.
Worst case for logarithm

Lefèvre and Muller [1] found that the floating point number whose logarithm is hardest to compute is
x = 1.0110001010101000100001100001001101100010100110110110 × 2^678
where the fractional part is written in binary. The log of x, again written in binary, is
111010110.0100011110011110101110100111110010010111000100000000000000000000000000000000000000000000000000000000000000000111
The first 53 significant bits, all the bits a floating point number can represent [2], are followed by 65 zeros. We have to compute the log with a precision of more than 118 bits in order to know whether the last bit of the fraction part of the float representation should be a 0 or a 1. Usually you don't need nearly that much precision, but this is the worst case.
Mathematica code

Here is Mathematica code to verify the result above. I don't know how to give Mathematica floating point numbers in binary, but I know how to give it integers, so I'll multiply the fraction part by 2^52 and take 52 from the exponent.
n = FromDigits["10110001010101000100001100001001101100010100110110110", 2]
x = n 2^(678 - 52)
BaseForm[N[Log[x], 40], 2]
This computes the log of x to 40 significant decimal figures, between 132 and 133 bits, and gives the result above.
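As a further check (a sketch relying on Mathematica's RealDigits; the variable digits is mine), you can pull out the base-2 digits of the high-precision value and confirm that bits 54 through 118 are all zero:

digits = First[RealDigits[N[Log[x], 40], 2, 120]];
Take[digits, 53]        (* the 53 bits a double can represent *)
Take[digits, {54, 118}] (* the run of 65 zeros that makes this the worst case *)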
Related posts

A few days ago I wrote about another example by Jean-Michel Muller: A floating point oddity.
William Kahan also came up recently in my post on summing an array of floating point numbers.
***
[1] Vincent Lefèvre, Jean-Michel Muller. Worst Cases for Correct Rounding of the Elementary Functions in Double Precision. Research Report 2000-35, November 2000.
[2] There are 52 bits in the fraction of an IEEE double. The leading bit of the significand is 1 unless the number is subnormal, so that bit is left implicit, squeezing out one extra bit of precision. That is, storing 52 bits gives us 53 bits of precision.
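To see that extra bit in action, here is a quick machine-precision sketch in Mathematica. Adding 2^-53 to 1 is lost to rounding, while adding 2^-52 is not, because 2^-52 is the gap between 1 and the next representable double.

1. + 2^-53 == 1.  (* True: 1 + 2^-53 rounds back to 1 in double precision *)
1. + 2^-52 == 1.  (* False: 1 + 2^-52 is the next double after 1 *)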