
The type of a floating point literal with exponent

What is the type of a floating-point literal that has an exponent part, such as 123456e-3, in C (C99 and later)? Is it of type float or double? When used as a float initializer, as in float f = 123456e-3;, does it need an f suffix?

Ron asked Jun 26 '20 14:06



2 Answers

By default, all floating point literals, with or without an exponent part, have type double. You can add the f suffix to make the type float or L to make the type long double.

In the case of float f = 123456e-3;, you're initializing a float with a double constant, so there is a possibility of precision loss. This particular constant has only 6 significant decimal digits, though, so it should be OK.

dbush answered Oct 29 '22 03:10


What is the type of a floating-point literal?

Floating constants

C defines these as floating constants, not literals. Default type is double.
An f or F suffix makes it a float.
An l or L suffix makes it a long double.

[edit] FLT_EVAL_METHOD

C has FLT_EVAL_METHOD (defined in <float.h>), which allows constants to be evaluated in a wider type.

Example: FLT_EVAL_METHOD == 2

"evaluate all operations and constants to the range and precision of the long double type."

With the declarations below, I'd expect v1 and v2 to have the same value when FLT_EVAL_METHOD == 2, but different values when FLT_EVAL_METHOD == 0.

long double v1 = 0.1;
long double v2 = 0.1L;

When used as a float initializer in float f = 123456e-3; does it need to have a f suffix?

For the best conversion of the text to float, yes, use an f.

float f = 123456e-3; incurs double rounding. Two roundings occur: text to double, then double to float.

With select values, g may get a different value with float g = x.xxx; than with float g = x.xxxf;. See the following.

Double rounding example

Notice that f2 and f4 have the same constant except for the f suffix. The compiler warns about f4:

warning: conversion from 'double' to 'float' changes value from '9.9999997019767761e-1' to '1.0e+0f' [-Wfloat-conversion]

#include <stdio.h>
int main(void) {
  // float has 24 bit significand, double has 53
  float f1 = 0x0.FFFFFFp0f;         // code with 24 bit significand, exact as a float
  printf("%-20a %.17e\n", f1, f1);
  float f2 = 0x0.FFFFFF7FFFFFFCp0f; // code with 54 bit significand, rounds down to nearest float
  printf("%-20a %.17e\n", f2, f2);
  float f3 = 0x0.FFFFFF80000000p0f; // code with 25 bit significand, rounds up to nearest float
  printf("%-20a %.17e\n", f3, f3);
  puts("");
  double d1 = 0x0.FFFFFF7FFFFFF8p0; // code constant with 53 bit significand, exact as a double
  printf("%-20a %.17e\n", d1, d1);
  double d2 = 0x0.FFFFFF7FFFFFFCp0; // code constant with 54 bit significand, rounds up to nearest double
  printf("%-20a %.17e\n", d2, d2);
  float f4 = 0x0.FFFFFF7FFFFFFCp0;  // code constant with 54 bit significand, rounds up to nearest double
                                    // then rounds up again when double converted to float
  printf("%-20a %.17e\n", f4, f4);
  return 0;
}

Output

0x1.fffffep-1        9.99999940395355225e-01
0x1.fffffep-1        9.99999940395355225e-01  f2
0x1p+0               1.00000000000000000e+00

0x1.fffffefffffffp-1 9.99999970197677501e-01
0x1.ffffffp-1        9.99999970197677612e-01
0x1p+0               1.00000000000000000e+00  f4 Double Rounding!

For the best conversion of the text to long double, definitely use an L; otherwise the constant is only a double, with less precision.

long double ld1 = 0x1.00000000000001p1;
printf("%.20Le\n", ld1);
long double ld2 = 0x1.00000000000001p1L; // "Same" constant as above with an 'L'
printf("%.20Le\n", ld2);

Output

2.00000000000000000000e+00
2.00000000000000002776e+00

chux - Reinstate Monica answered Oct 29 '22 05:10