With pairwise summation, how many terms do I need to get an appreciably wrong result?

Question

Using a given species of fp numbers, say float16, it is straightforward to construct sums with totally wrong results. For example, using Python/NumPy:

import numpy as np

one = np.float16(1)
ope = np.nextafter(one,one+one)

np.array((ope,one,-one,-one)).cumsum()
# array([1.001, 2.   , 1.   , 0.   ], dtype=float16)

Here we have used cumsum to force naive summation. Left to its own devices numpy would have used a different order of summation, yielding a better answer:

np.array((ope,one,-one,-one)).sum()
# 0.000977
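The zero in the cumulative result can be traced to one rounding step. The sketch below assumes float16 arithmetic rounds to nearest with ties to even; the names `first`, `lost` and `gap_at_two` are illustrative only.

```python
import numpy as np

one = np.float16(1)
ope = np.nextafter(one, one + one)      # next float16 above 1, i.e. 1 + 2**-10

# The exact value of ope + one is 2 + 2**-10, which lies halfway between
# the neighbouring float16 values 2 and 2 + 2**-9.
first = ope + one
lost = first == np.float16(2)           # the tie is rounded to 2, so the extra bit is gone
gap_at_two = np.spacing(np.float16(2))  # distance from 2 to the next float16

print(first, lost, gap_at_two)
```

Once the first addition has returned exactly 2, subtracting `one` twice can only give 0, which is what the `cumsum` output shows.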

The above is based on cancellation. To rule out this class of examples, let us only allow nonnegative terms. For naive summation it is still easy to give examples with very wrong sums. The following sums 10^4 identical terms, each equal to 10^-4:

np.full(10**4,10**-4,np.float16).cumsum()
# array([1.0e-04, 2.0e-04, 3.0e-04, ..., 2.5e-01, 2.5e-01, 2.5e-01],
#       dtype=float16)

The final partial sum is off by a factor of 4.
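Why the running total stops at 0.25 can be checked directly. This is a small sketch, assuming round-to-nearest float16 arithmetic; the variable names are illustrative.

```python
import numpy as np

term = np.float16(1e-4)
total = np.float16(0.25)

# Gap between 0.25 and the next float16 above it.
ulp_at_quarter = np.spacing(total)

# The term is smaller than half that gap, so adding it rounds back to 0.25.
stalled = (total + term) == total

# Just below 0.25 the gap is half as large, so the same term still moves the sum.
below = np.float16(0.2)
still_growing = (below + term) > below

print(ulp_at_quarter, stalled, still_growing)
```

So the naive running sum keeps growing until it reaches 0.25 and then every further addition is rounded away.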

Again, allowing numpy to use pairwise summation gives a much better result:

np.full(10**4,10**-4,np.float16).sum()
# 1.0
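For comparison, here is a minimal sketch of textbook pairwise summation by recursive halving, with every intermediate result rounded to float16. It is not NumPy's implementation, which may use blocking and different accumulators; `pairwise_sum` and `naive_sum` are hypothetical helper names.

```python
import numpy as np

def pairwise_sum(x):
    """Sum a 1-D float16 array by recursive halving, rounding to float16 at each step."""
    n = len(x)
    if n == 0:
        return np.float16(0)
    if n == 1:
        return np.float16(x[0])
    mid = n // 2
    return np.float16(pairwise_sum(x[:mid]) + pairwise_sum(x[mid:]))

def naive_sum(x):
    """Left-to-right accumulation in float16, equivalent to the last cumsum entry."""
    total = np.float16(0)
    for term in x:
        total = np.float16(total + term)
    return total

terms = np.full(10**4, 10**-4, np.float16)
print(naive_sum(terms), pairwise_sum(terms))
```

The recursion depth is only about log2(n), which is why the rounding errors accumulate so much more slowly than in the left-to-right loop.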

It is possible to construct sums that beat pairwise summation. Choosing eps below the resolution at 1, we can use the sequence 1, eps, 0, eps, 3x0, eps, 7x0, eps, 15x0, eps, ... (where kx0 denotes k consecutive zeros), but this involves an insane number of terms.
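A small instance of this construction can be built and checked against a plain recursive-halving pairwise sum. This is a sketch under assumptions: `pairwise_sum` and `adversarial_terms` are hypothetical helpers, the halving scheme is the textbook one rather than NumPy's reduction, and eps = 2**-12 is one arbitrary choice below the float16 resolution at 1.

```python
import numpy as np

def pairwise_sum(x):
    """Textbook pairwise summation, rounding to float16 at each step."""
    n = len(x)
    if n == 1:
        return np.float16(x[0])
    mid = n // 2
    return np.float16(pairwise_sum(x[:mid]) + pairwise_sum(x[mid:]))

def adversarial_terms(k, eps):
    """Length 2**k array: 1, eps, 0, eps, 3x0, eps, 7x0, eps, ..."""
    terms = np.zeros(2**k, dtype=np.float16)
    terms[0] = 1
    for j in range(k):
        terms[2**(j + 1) - 1] = eps   # last slot of each successive block
    return terms

k = 12
eps = np.float16(2.0**-12)
terms = adversarial_terms(k, eps)

computed = pairwise_sum(terms)
exact = terms.astype(np.float64).sum()   # 1 + k * eps
print(computed, exact)
```

Each eps meets the accumulated 1 on its own and is rounded away, so the computed sum stays at 1 while the exact sum is 1 + k * eps. The array has 2**k entries but only k of them are eps, so the error grows with log2 of the length; losing a factor of 2 would need k on the order of 1/eps and an array far too long to store.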

My question: Using float16 and only nonnegative terms, how many terms are required to obtain from pairwise summation a result that is off by at least a factor of 2?

Bonus: Same question with "positive" instead of "nonnegative". Is it even possible?

David Eisenstat · Accepted Answer

It would take such a large number of terms that it's effectively impossible (if zeros are allowed) or actually impossible (if zeros are not allowed, due to overflow). Wikipedia summarizes some error bounds due to Nicholas Higham. Since all of the terms are nonnegative, the condition number is 1, hence the relative error for n terms is bounded as |E_n|/|S_n| ≤ ε log₂ n / (1 - ε log₂ n), where ε is the machine epsilon. To be off by a factor of two, we would need |E_n| ≥ |S_n|, which is possible only if ε log₂ n / (1 - ε log₂ n) ≥ 1, that is, ε log₂ n ≥ 1/2, which is equivalent to n ≥ 2^{1/(2 ε)} = 2¹⁰²⁴ for float16 (taking ε = 2⁻¹¹, the unit roundoff).
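The arithmetic behind these numbers can be reproduced exactly with rational arithmetic. This is a sketch, assuming ε is taken as the float16 unit roundoff 2**-11 (the value that yields the exponent 1024); all variable names are illustrative.

```python
from fractions import Fraction
import numpy as np

u = Fraction(1, 2**11)            # assumed unit roundoff for float16

# eps * log2(n) >= 1/2  <=>  log2(n) >= 1 / (2 * eps)
log2_n_min = 1 / (2 * u)
n_min = 2 ** int(log2_n_min)      # exact Python integer, 2**1024

# Smallest positive float16 and largest finite float16, as exact fractions.
tiny = Fraction(float(np.nextafter(np.float16(0), np.float16(1))))
largest = Fraction(float(np.finfo(np.float16).max))

# With strictly positive terms, the true sum of n_min terms is at least n_min * tiny.
true_sum_lower_bound = n_min * tiny
print(log2_n_min, tiny, largest, true_sum_lower_bound > largest)
```

With zeros excluded, even the smallest possible terms push the true sum of that many terms far beyond the largest finite float16, which is the overflow obstacle mentioned above.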

Tags: python, algorithm, floating-point, numpy, sum

Paul Panzer
