First we shift the input left until a one is shifted out while subtracting
the fixed point representation of ln(2) from the answer - initialised to
32*ln(2). The typical normization step. Obviously this stage can be speeded
up with binary search approaches.
Start with a 32bit floating point register initialized to 32*log(2) = 32 * 0.69314718.. = 22.180710 .
Now in a loop, shift the input to the left and subtract log(2) from the register until the input overflows to the left with a one. If started with a 1.00, the register holds now a 0.00 . For each power of 2 greater 1, once log2 is in the register.
So now we have the value 1+x+y in the form of a 32-bit fixed point number
x+y with the binary point at the left. x, here, is the top eight bits and y
the next 24. So 0 <= x < 1 is in steps of 1/256, and 0 <= y < 1/256.
input = 1.xxxxxxxx yyyyyyyy yyyyyyyy yyyyyyyy
We want ln(1+x+y)
= ln((1+x)*(1+y/(1+x))
= ln(1+x) + ln(1+y/(1+x)). (a)
The first term of (a) has only 256 values, and so can be looked up from a
table. The second term is small, which we'll now estimate.
Define z = y/(1+x), then
ln(1+z) is approximately z - z*z/2 +...
note that the z cubed and higher terms are smaller than 1 lsb and to a good
approximation we can correct ln(1+z) from the linear with a small table
because no more than eight bits are significant.
OK, to make this a bit more explicit, lets use upper case letters for the
binary (whole) numbers representing the various fractions. X = x*(2**8); Y =
y*(2**32). Then the original normalized number is 1 + X/(2**8) + Y/(2**32).
Look up ln(1+X/(2**8)) in a table indexed by X using a 24-bit fraction
format - call this L.
Set Z = Y/((2**8)+X) = z * (2**24), so its already in the 24-bit fraction
format and less than 2**16.
Add Z to L.
Subtract a correction corresponding to the z*z/2 term by using (Z>>8) as an
index into a byte-wide table. In the 24-fraction format the correction is
roughly Z*Z/2/(2**24) or (Z/(2**8))*(Z/(2**8))/(2**9). So max correction is
< 128.
So the longest step is the divide, which is actually 24-bit/9-bit. This
could be speeded up on ARM7 by using the 32x32->64 multiply, looking up the
multipliers corresponding to 1/(1+x).
|