 

floating point product expansion equivalence

In IEEE 754 floating point, it is possible that

a*(b-c) != a*b-a*c // a, b, c double

So expansion of a product is not guaranteed to be equal to the unexpanded product.

But what about this:

a*(b1+b2+...+bn) == a*b1+a*b2+...+a*bn // b1==b2==...==bn

When all b are equal, is the equivalence guaranteed (assuming no under- or overflow)? Does it make a difference whether the equality of the b's is known at compile time?

Edit:

It is not - see Eric Postpischil and Pascal Cuoq.

But does this weaker assertion hold?:

   (1.0/n)*(b1+b2+...+bn) <= 1.0
&& (1.0/n)*b1+(1.0/n)*b2+...+(1.0/n)*bn <= 1.0

// when all b<=1.0 and n integral double but not power of 2
// so that 1.0/n not exactly representable with base-2 floating point

I simply wonder whether you can guarantee that the computed average of a data set never exceeds a bound that no single data value exceeds, no matter how the average is computed (summing everything and dividing once, or summing the values each divided by n).

Edit2:

Ok, the && doesn't hold. See Eric Postpischil and David Hammen:

average of nine 1.0 -> only the first condition holds; the term-by-term sum exceeds 1.0.
average of ten 1.0/3 -> only the second condition holds; the sum-then-divide average exceeds every single data value.

Does the optimal way to compute an average then depend on the expected upper bound of the data? Or also on the size n of the data set? Or is there no single method that is always optimal?

asked Feb 10 '14 by mb84



3 Answers

Even excluding overflow and underflow, the results may differ.

.3 * (.3+.3+.3) is 0.269999999999999962252417162744677625596523284912109375 while .3*.3 + .3*.3 + .3*.3 is 0.270000000000000017763568394002504646778106689453125 (when both are evaluated with IEEE-754 rules and 64-bit binary floating-point).

Regarding the questions added in an update:

There are two questions, one of which asks whether the computed average of a set of numbers, each not exceeding one, can exceed one. As David Hammen points out, calculating the average of nine 1s as 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 + 1./9*1 produces 1.0000000000000002220446049250313080847263336181640625.

The other asks whether the computed average of a set can exceed all of the numbers in the set. This is a superset of the first question, but here is an example using a different calculation of the average:

Let b = 1./3 (which is 0.333333333333333314829616256247390992939472198486328125).

Then 1./10 * (b+b+b+b+b+b+b+b+b+b) is 0.33333333333333337034076748750521801412105560302734375, which is greater than b.

answered Oct 23 '22 by Eric Postpischil

IEEE 754 doesn't delve too much into language details. In particular, details like "compile time" aren't specified. That doesn't even make sense for some languages.

The easiest case to understand is an intermediate infinity. Assume the exact sum of the b's barely overflows, so sum(b) == INF, while a is 0.5. The result a*sum(b) is still INF. However, by multiplying each term by a first, the partial sums stay in range and the additions no longer overflow.

answered Oct 23 '22 by MSalters


But maybe holds the weaker assertion?:

(1.0/n)*(b1+b2+...+bn) <= 1.0
&& (1.0/n)*b1+(1.0/n)*b2+...+(1.0/n)*bn <= 1.0

No. For example, this assertion fails on my computer with n=9 and bi=1.0.

I simply wonder if you can guarantee that the average of a data set does not exceed some value that is also not exceeded by every single data value, no matter how you compute the average (first adding and once dividing, or adding every value divided).

Once again, the answer is no. The correlation E[(X-Xbar)*(Y-Ybar)]/(sigma_x * sigma_y) between two random variables should always be between -1.0 and 1.0. Yet if you compute the statistics for two perfectly correlated (or perfectly anti-correlated) random variables, you'll oftentimes see a correlation that is slightly greater than +1 (or slightly less than -1).

answered Oct 23 '22 by David Hammen