
Does the C++ standard allow this floating-point behaviour?

In the following code:

#include <cstdint>
#include <cinttypes>
#include <cstdio>

using namespace std;

int main() {
    double xd = 1.18;
    int64_t xi = 1000000000;

    int64_t res1 = (double)(xi * xd);

    double d = xi * xd;
    int64_t res2 = d;

    printf("%" PRId64"\n", res1);
    printf("%" PRId64"\n", res2);
}

Using g++ 4.9.3 with -std=c++14, targeting 32-bit Windows, I get this output:

1179999999
1180000000

Are these values allowed to be different?

I expected that, even if the compiler uses a higher internal precision than double for the computation of xi * xd, it should do this consistently. Loss of precision in a floating-point conversion is implementation-defined, and the precision of this calculation is also implementation-defined: [c.limits]/3 says that FLT_EVAL_METHOD should be imported from C99. In other words, I expected that the compiler would not be allowed to use a different precision for xi * xd on one line than it does on another.

Note: This is intentionally a C++ question and not a C question - I believe the two languages have different rules in this area.

asked Dec 21 '15 05:12 by M.M



1 Answer

even if the compiler uses a higher internal precision than double for the computation of xi * xd, it should do this consistently

Whether required or not (discussed below), this clearly doesn't happen: Stack Overflow is littered with questions from people who've seen similar-seeming calculations change for no apparent reason within the same program.

The C++ Standard draft n3690 says (emphasis mine):

The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.(62)

62) The cast and assignment operators must still perform their specific conversions as described in 5.4, 5.2.9 and 5.17.

So, in agreement with M.M.'s comment and contrary to my earlier edit, it's the version with the (double) cast that must be rounded to a 64-bit double before truncation to integer. That rounded value evidently happens to be >= 1180000000 in the run documented in the question. Without the conversion that footnote 62) requires, the general rule leaves the compiler free not to round early in the other case.
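To make the two outcomes concrete, here is a minimal sketch, not a statement of what any particular compiler does. It assumes IEEE-754 doubles with round-to-nearest, and it uses long double only to imitate an extended-precision intermediate. The helper names are made up for this example.

```cpp
#include <cfloat>
#include <cstdint>

// Hypothetical helper: truncate xi * xd after the product has really been
// rounded to a 64-bit double. The volatile store forces the value out to a
// double-sized object, which drops any excess precision.
std::int64_t truncate_rounded(std::int64_t xi, double xd) {
    volatile double stored = xi * xd;
    return static_cast<std::int64_t>(stored);
}

// Hypothetical helper: truncate xi * xd computed in long double, imitating a
// compiler that keeps the intermediate in a wider format.
std::int64_t truncate_extended(std::int64_t xi, double xd) {
    long double wide = static_cast<long double>(xi) * static_cast<long double>(xd);
    return static_cast<std::int64_t>(wide);
}

// True only where long double carries more significand bits than double
// (for example x87 80-bit); elsewhere the two helpers cannot differ.
constexpr bool long_double_is_wider = LDBL_MANT_DIG > DBL_MANT_DIG;
```

The nearest double to 1.18 is slightly below 1.18, so the exact product with 1000000000 is a hair under 1180000000. Rounding that product to double lands on 1180000000 exactly, while a wider format can represent the slightly smaller value, and truncating it gives 1179999999. That matches the pair of numbers in the question.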

[c.limits]/3 says that FLT_EVAL_METHOD should be imported from C99. IOW I expected that it should not be allowed to use a different precision for xi * xd on one line than it does on another line.

Check the cppreference page:

Regardless of the value of FLT_EVAL_METHOD, any floating-point expression may be contracted, that is, calculated as if all intermediate results have infinite range and precision (unless #pragma STDC FP_CONTRACT is off)

As tmyklebu comments, it continues:

Cast and assignment strip away any extraneous range and precision: this models the action of storing a value from an extended-precision FPU register into a standard-sized memory location.

This last agrees with the "62)" part of the Standard.
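For readers unsure what "contracted" means in the cppreference quote, the usual example is a multiply followed by an add being fused into one operation with a single rounding. The sketch below is only an illustration under the assumption of IEEE-754 doubles. It uses std::fma from <cmath> to stand in for what a contracting compiler might do, and the helper names are invented for this example.

```cpp
#include <cmath>

// Hypothetical helper: a * b + c with the product rounded to double first.
// The volatile store forces that rounding and keeps the compiler from fusing
// the multiply and the add.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Hypothetical helper: a * b + c evaluated as if with infinite intermediate
// precision and rounded once at the end, which is what contraction permits.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-27 and c = -(1 + 2^-26), the exact square of a is 1 + 2^-26 + 2^-54. Rounding the product to double discards the 2^-54 term, so mul_then_add(a, a, c) returns 0, whereas fused_mul_add(a, a, c) returns 2^-54.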

M.M. comments:

STDC FP_CONTRACT does not seem to appear in the C++ Standard and also it's not clear to me exactly to what extent the C99 behaviour is 'imported'

It doesn't appear in the draft I looked at either. That suggests C++ doesn't guarantee that the pragma is available, which leaves the default mentioned above, "any floating-point expression may be contracted". Even so, per M.M.'s comments and the Standard and cppreference quotes above, the (double) cast is an exception that forces rounding to a 64-bit double.

The C++ Standard draft mentioned above says of <cfloat>:

The contents are the same as the Standard C library header <float.h>. See also: ISO C 7.1.5, 5.2.4.2.2, 5.2.4.2.1.

If one of those C Standards required STDC FP_CONTRACT there's more chance of it being portable for use by C++ programs, but I've not surveyed implementations for support.
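Since FLT_EVAL_METHOD does come in through <cfloat>, a program can at least report what its implementation claims. A small sketch follows. The function name is made up, and the descriptions paraphrase the C99 meanings of the values rather than quoting them.

```cpp
#include <cfloat>
#include <string>

// Hypothetical helper: map a FLT_EVAL_METHOD value to a short description.
std::string describe_eval_method(int method) {
    switch (method) {
        case -1: return "indeterminable";
        case 0:  return "each operation evaluated in its own type";
        case 1:  return "float and double evaluated as double";
        case 2:  return "all operations evaluated as long double";
        default: return "implementation-defined";
    }
}
```

Calling describe_eval_method(FLT_EVAL_METHOD) on the 32-bit x87 target from the question would be a quick way to see whether extended-precision intermediates are being advertised. Even then, the macro only describes the evaluation format and says nothing about whether a given compiler honours the rounding that casts and assignments require.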

answered Sep 22 '22 12:09 by Tony Delroy