Let's say I have three 32-bit floating point values, a, b, and c, such that (a + b) + c != a + (b + c). Is there a summation algorithm, perhaps similar to Kahan summation, that guarantees that these values can be summed in any order and always arrive at the exact same (fairly accurate) total? I'm looking for the general case (i.e. not a solution that only deals with 3 numbers).
Is arbitrary precision arithmetic the only way to go? I'm dealing with very large data sets, so I'd like to avoid the overhead of using arbitrary precision arithmetic if possible.
Thanks!
There's an interesting 'full-precision-summation' algorithm here, which guarantees that the final sum is independent of the order of the summands (recipe given in Python; but it shouldn't be too difficult to translate to other languages).  Note that the recipe as given in that link isn't perfectly correct: the main accumulation loop is fine, but in the final step that converts the list of accumulated partial sums to a single floating-point result (the very last line of the msum recipe), one needs to be a little bit more careful than simply summing the partial sums in order to get a correctly-rounded result.  See the comments below the recipe, and Python's implementation (linked below) for a way to fix this.
It does use a form of arbitrary-precision arithmetic to hold partial sums (the intermediate sums are represented as 'non-overlapping' sums of doubles), but may nevertheless be fast enough, especially when all the inputs are of roughly the same magnitude. And it always gives a correctly rounded result, so accuracy is as good as you could hope for and the final sum is independent of the order of the summands. It's based on this paper (Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates) by Jonathan Shewchuk.
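As a rough illustration of the idea, here is a minimal Python sketch of the accumulation loop, following the shape of the msum recipe described above. The function name `exact_sum_sketch` is hypothetical. The final rounding step uses exact rational arithmetic from `fractions.Fraction`; that is my own substitution to sidestep the rounding subtlety mentioned earlier, not what CPython's `math_fsum` does. It assumes finite inputs and no intermediate overflow, and it works on Python floats, which are 64-bit doubles rather than the 32-bit floats in the question.

```python
from fractions import Fraction


def exact_sum_sketch(values):
    """Order-independent, correctly rounded sum of finite floats (sketch)."""
    partials = []  # non-overlapping floats whose exact sum equals the sum so far
    for x in values:
        i = 0
        for y in partials:
            # Error-free addition: hi + lo == x + y exactly,
            # provided |x| >= |y| (hence the swap).
            if abs(x) < abs(y):
                x, y = y, x
            hi = x + y
            lo = y - (hi - x)
            if lo:
                partials[i] = lo  # keep the non-zero rounding error
                i += 1
            x = hi
        partials[i:] = [x]
    # Assumption: do the final step exactly with rationals, then round once.
    # Converting a Fraction to float rounds correctly in Python 3.
    return float(sum(map(Fraction, partials), Fraction(0)))
```

Because the partials always represent the running sum exactly, and the result is rounded only once at the end, the answer cannot depend on the order of the inputs. The `Fraction` step is slow, but it only touches the handful of partials, not the whole data set.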
Python uses this algorithm for its implementation of math.fsum, which does correctly-rounded order-independent summation; you can see the C implementation that Python uses here (look for the math_fsum function).
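To see the difference in practice, the following small sketch sums three values in every possible order, once with plain left-to-right addition and once with `math.fsum`. The values are my own choice for illustration, and Python floats are doubles, so this only approximates the 32-bit situation in the question. Plain addition is spelled with `functools.reduce` rather than the built-in `sum`, because recent Python versions make `sum` more accurate for floats, which would hide the effect.

```python
import functools
import itertools
import math
import operator

values = [1e16, 1.0, -1e16]

# Plain left-to-right floating-point addition for every ordering.
naive = {functools.reduce(operator.add, p)
         for p in itertools.permutations(values)}

# math.fsum for every ordering.
exact = {math.fsum(p) for p in itertools.permutations(values)}

print(sorted(naive))  # [0.0, 1.0] on IEEE 754 doubles
print(sorted(exact))  # [1.0]
```

Plain addition produces two different totals depending on the order, while `math.fsum` returns the same value every time.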
With some additional information about the terms you have to sum, you can avoid the overhead of Shewchuk's algorithm.
In IEEE 754 arithmetic, for floating-point numbers x and y of the same format, x-y is exact whenever y/2 <= x <= 2*y (Sterbenz's lemma, formally proved here).
So if you can arrange your terms so that every partial sum is an operation of that form (the running total and the next term have opposite signs and magnitudes within a factor of two of each other), then each addition is exact and you get the exact result for free.
I am afraid that in practice there is little chance of being in conditions where this is guaranteed to happen. Alternating positive and negative numbers of increasing magnitude may be one case where it happens.
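The exactness condition can be checked empirically. The sketch below compares the floating-point difference against the exact rational difference using `fractions.Fraction`; the helper name `subtraction_is_exact` and the sample values are hypothetical, chosen only for illustration, and Python floats are doubles rather than 32-bit floats.

```python
from fractions import Fraction


def subtraction_is_exact(x, y):
    """Return True if the float result x - y equals the true difference."""
    return Fraction(x) - Fraction(y) == Fraction(x - y)


# Inside the range y/2 <= x <= 2*y: the subtraction is exact.
print(subtraction_is_exact(0.3, 0.2))    # True

# Far outside the range: the small operand is rounded away entirely.
print(subtraction_is_exact(1.0, 1e-17))  # False
```

Note that being outside the range does not force an inexact result; the condition is sufficient, not necessary.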
Note: the original question was about an algorithm that would give the same result regardless of the summation order. Mark's answer initiated a drift in the direction of "an exact algorithm", but rereading your question, I am afraid I am pushing things too far by suggesting that you reorder the terms. You probably can't do that in what you are trying to do, and my answer is probably off-topic. Well, sorry :)