I am aware of a similar question, but I want to ask for people's opinions on my algorithm for summing floating point numbers as accurately as possible at a practical cost.
Here is my first solution:
put all numbers into a min-absolute-heap (a heap ordered by absolute value). // EDIT: as suggested by the comments below
pop the 2 smallest ones.
add them.
put the result back into the heap.
continue until there is only 1 number in the heap.
This one would take O(n log n) instead of the usual O(n). Is that really worth it?
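For concreteness, here is a rough C sketch of this scheme using a small hand-rolled binary heap keyed on absolute value (the helper names are just illustrative):

#include <math.h>

static void sift_up(double *h, int i) {
    while (i > 0) {
        int p = (i - 1) / 2;
        if (fabs(h[p]) <= fabs(h[i])) break;
        double t = h[p]; h[p] = h[i]; h[i] = t;
        i = p;
    }
}

static void sift_down(double *h, int n, int i) {
    for (;;) {
        int l = 2 * i + 1, r = l + 1, m = i;
        if (l < n && fabs(h[l]) < fabs(h[m])) m = l;
        if (r < n && fabs(h[r]) < fabs(h[m])) m = r;
        if (m == i) break;
        double t = h[m]; h[m] = h[i]; h[i] = t;
        i = m;
    }
}

/* Sums the n numbers in a[] by repeatedly replacing the two
   smallest-magnitude values with their sum. a[] is used as heap
   storage and is overwritten. */
double heap_sum(double *a, int n) {
    if (n == 0) return 0.0;
    for (int i = 1; i < n; i++) sift_up(a, i);   /* build the heap by insertion */
    while (n > 1) {
        double x = a[0];                         /* pop the smallest-magnitude value */
        a[0] = a[--n];
        sift_down(a, n, 0);
        a[0] += x;                               /* add it to the next smallest... */
        sift_down(a, n, 0);                      /* ...and restore the heap property */
    }
    return a[0];
}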
The second solution comes from a characteristic of the data I'm working on: it is a huge list of positive numbers of similar magnitude.
double a[size]; // contains the numbers, starting at index 0
int step, i;
for (step = 1; step < size; step <<= 1) {
    for (i = step - 1; i + step < size; i += 2 * step)
        a[i + step] += a[i];        // fold each element into its partner to the right
    if (i < size - 1)
        a[size - 1] += a[i];        // fold any leftover partial sum into the last slot
}
// the total ends up in a[size - 1]
The basic idea is to do the sum in a 'binary tree' fashion. Note: it's pseudo C code; step <<= 1 means multiply step by 2.
This one would take O(n).
I feel like there might be a better approach. Can you recommend/criticize?
Kahan's summation algorithm is significantly more precise than straightforward summation, and it runs in O(n): somewhere between 1 and 4 times slower than straightforward summation, depending on how fast floating-point arithmetic is compared to data access. It is definitely less than 4 times slower on desktop hardware, and it requires no shuffling of data around.
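For reference, a plain C sketch of Kahan (compensated) summation; note that aggressive compiler flags such as -ffast-math may reorder the arithmetic and defeat the compensation:

double kahan_sum(const double *a, int n) {
    double sum = 0.0, c = 0.0;   /* c holds the running compensation (lost low-order bits) */
    for (int i = 0; i < n; i++) {
        double y = a[i] - c;     /* apply the correction from the previous step            */
        double t = sum + y;      /* low-order bits of y may be lost in this addition       */
        c = (t - sum) - y;       /* algebraically zero; in floating point it recovers them */
        sum = t;
    }
    return sum;
}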
Alternatively, if you are using the usual x86 hardware, and if your compiler allows access to the 80-bit long double type, simply use the straightforward summation algorithm with an accumulator of type long double. Only convert the result to double at the very end.
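A minimal sketch of that approach, assuming long double is the 80-bit extended type on your platform:

double long_double_sum(const double *a, int n) {
    long double sum = 0.0L;      /* wider accumulator */
    for (int i = 0; i < n; i++)
        sum += a[i];
    return (double)sum;          /* convert back to double only at the end */
}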
If you really need a lot of precision, you can combine the above two solutions by using long double for the variables c, y, t, and sum in Kahan's summation algorithm.
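Combined, that is simply the Kahan loop above carried out in long double, e.g.:

double kahan_long_double_sum(const double *a, int n) {
    long double sum = 0.0L, c = 0.0L;
    for (int i = 0; i < n; i++) {
        long double y = (long double)a[i] - c;
        long double t = sum + y;
        c = (t - sum) - y;
        sum = t;
    }
    return (double)sum;          /* round to double once, at the very end */
}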