Why does a division result differ based on the cast type? (Followup)

This is a follow-up to this question: Why does a division result differ based on the cast type?

Quick Summary:

byte b1 = (byte)(64 / 0.8f); // b1 is 79
int b2 = (int)(64 / 0.8f); // b2 is 79
float fl = (64 / 0.8f); // fl is 80

The question is: Why are the results different depending on the cast type? While working out an answer I ran into an issue I wasn't able to explain.

var bytes = BitConverter.GetBytes(64 / 0.8f).Reverse(); // Reverse endianness
var bits = bytes.Select(b => Convert.ToString(b, 2).PadLeft(8, '0'));
Console.WriteLine(string.Join(" ", bits));

This outputs the following:

01000010 10100000 00000000 00000000

Breaking it down in IEEE 754 format:

0 10000101 01000000000000000000000

Sign:

0 => Positive

Exponent:

10000101 => 133 in base 10

Mantissa:

01000000000000000000000 => 0*2^-1 + 1*2^-2 + 0*2^-3 ... = 1/4 = 0.25

Decimal Representation:

(1 + 0.25) * 2^(133 - 127) = 1.25 * 2^6 = 80 (subtracting the single-precision exponent bias of 127)

This results in exactly 80. So why does casting the result make a difference?
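
For completeness, here is a quick sanity check (assuming, as above, that BitConverter.GetBytes returns the bytes in little-endian order on this machine). It confirms that the float value of 64 / 0.8f is exactly 80 and carries the bit pattern worked out above:

float f = 64 / 0.8f;

// Reassemble the 32 bits of the float from its little-endian bytes
int bits = BitConverter.ToInt32(BitConverter.GetBytes(f), 0);

Console.WriteLine(f == 80f);           // True: the float is exactly 80
Console.WriteLine(bits == 0x42A00000); // True: 0 10000101 01000000000000000000000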

asked Sep 06 '14 by ConditionRacer


2 Answers

My answer in the other thread is not entirely correct: when the computation is actually performed at runtime, (byte)(64 / 0.8f) is 80.

When a float that already holds the result of 64 / 0.8f is cast to byte at runtime, the result actually is 80. However, this is not the case when the cast is applied directly to the constant expression in the assignment:

float f1 = (64 / 0.8f);

byte b1 = (byte) f1;
byte b2 = (byte)(64 / 0.8f);

Console.WriteLine(b1); //80
Console.WriteLine(b2); //79

While b1 contains the expected result, b2 is off. According to the disassembly, b2 is assigned as follows:

mov         dword ptr [ebp-48h],4Fh 

Thus, the compiler seems to calculate a different result than the one obtained at runtime. I don't know, however, whether this is the expected behavior.

EDIT: Maybe it is the effect Pascal Cuoq described: at compile time, the C# compiler uses double to evaluate the expression. This yields 79.999..., which is truncated to 79 (a double has enough precision to make the difference visible here).
Using float, however, we don't actually run into the issue, because the floating-point "error" is smaller than what a float can represent, so the result rounds to exactly 80.

At runtime, this one also prints 79:

double d1 = (64 / 0.8f);
byte b3 = (byte) d1;
Console.WriteLine(b3); //79
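
Printing that intermediate double with full precision makes the truncation visible. A minimal sketch (the exact trailing digits may vary, but the value is just below 80):

double d2 = 64 / (double)0.8f;       // 0.8f widened to double, then divided in double precision
Console.WriteLine(d2.ToString("R")); // roughly 79.9999988..., i.e. just below 80
Console.WriteLine((byte)d2);         // 79: the fractional part is truncated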

EDIT2: At the request of Pascal Cuoq, I ran the following code:

int sixtyfour = Int32.Parse("64");
byte b4 = (byte)(sixtyfour / 0.8f);
Console.WriteLine(b4); //79

The result is 79. So the statement above, that the compiler and the runtime calculate different results, is not true.

EDIT3: When the previous code is changed as follows (credits to Pascal Cuoq, again), the result is 80:

byte b5 = (byte)(float)(sixtyfour / 0.8f);
Console.WriteLine(b5); //80

Note, however, that this is not the case when the expression is written with constants only (the result is 79):

byte b6 = (byte)(float)(64 / 0.8f);
Console.WriteLine(b6); //79

So here is what seems to be happening: (byte)(64 / 0.8f) is not evaluated as a float but as a double (before being cast to byte). This introduces a rounding error that does not occur when the calculation is done in float. An explicit cast to float before the cast to byte (which ReSharper marks as redundant, by the way) "solves" this issue. However, when the calculation is done at compile time (which is possible when only constants are involved), the explicit cast to float seems to be ignored / optimized away.

TLDR: Floating point calculations are even more complicated than they initially seem.

answered by Matthias


The C# language specification allows intermediate floating-point results to be computed at a precision greater than that of the type. This is very likely what is happening here.

Computed at that higher precision, 64 / 0.8 is slightly less than 80 (because 0.8 cannot be represented exactly in binary floating-point), and so it converts to 79 when truncated to an integer type. If the result of the division is converted to float, however, it is rounded to 80.0f.

(Conversions from one floating-point type to another round to nearest. Technically they are done according to the rounding mode of the FPU, but C# does not allow changing the FPU's rounding mode from its "to nearest" default. Conversions from floating-point to integer types truncate.)
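
As a short illustration of these two conversion behaviors (a sketch only; 0.8f widened to double stands in for the higher-precision intermediate result):

double intermediate = 64 / (double)0.8f;  // slightly less than 80

Console.WriteLine((float)intermediate);   // 80: conversion to float rounds to nearest
Console.WriteLine((int)intermediate);     // 79: conversion to an integer type truncates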

answered by Pascal Cuoq