What are the actual min/max values for float and double (C++)

I have read recommendations to use the "FLT_MIN" and "FLT_MAX" values for float. Whenever I do this, Code::Blocks tells me they are

max: 3.40282e+038 min: 1.17549e-038

Not knowing what this meant I tried to get real values and got

max: 47.2498237715 min: -34.8045265148

... but these don't clarify things.

Here is a snippet from my code

   char c;         // reserve: 1 byte, store 1 character (-128 to 127)
   int i;          // reserve: 4 bytes, store -2147483648 to 2147483647
   short int s;    // reserve: 2 bytes, store -32768 to 32767
   float f;        // reserve: 4 bytes, store ?? - ?? (? digits)
   double d;       // reserve: 8 bytes, store ?? - ?? (? digits)
   unsigned int u; // reserve: 4 bytes, store 0 to 4294967295

   c = 'c';
   cout << c <<" lives at " << static_cast<void*>(&c) <<endl; // cast so the address is printed, not a C string

   i = 40000;
   cout << i <<" lives at " << &i <<endl;

   s = 100;
   cout << s <<" lives at " << &s <<endl;

   f = 10.1;
   cout << f <<" lives at " << &f <<endl;

   d = 10.102;
   cout << d <<" lives at " << &d <<endl;

   u = 1723;
   cout << u <<" lives at " << &u <<endl;

In the snippet we can clearly see the min and max values of a short int, for example: -32768 to 32767. These are proper, understandable values, but for float and double the real values are not clear.

user9318444 asked Feb 05 '18 19:02


2 Answers

Alright. Using what I learned from here (thanks everyone) and the other parts of the web I wrote a neat little summary of the two just in case I run into another issue like this.

In C++ the two most commonly used types for storing non-integer values are float and double (there is also long double).

Floats and Doubles

A float can store values from:

  • -340282346638528859811704183484516925440.0000000000000000 Float lowest
  • 340282346638528859811704183484516925440.0000000000000000 Float max

A double can store values from:

  • -179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.0000000000000000 Double lowest

  • 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.0000000000000000 Double max

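A float has a 24-bit binary significand, which is roughly 7 significant decimal digits. Only 6 decimal digits are guaranteed to survive a decimal to float to decimal round trip (std::numeric_limits<float>::digits10), while 9 digits are needed to write a float so that it reads back as exactly the same value (max_digits10).

A double, as the name suggests, has roughly twice the precision of a float: a 53-bit significand, or about 15 to 16 significant decimal digits. 15 digits are guaranteed to round-trip (digits10), and 17 are needed to read back exactly the same value (max_digits10).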

e.g.

     float x = 1.426;
     double y = 8.739437;

Decimals & Math

Because a float carries about 7 significant decimal digits and a double about 15, the output precision must be set explicitly when printing results; by default cout shows only 6 significant digits.

e.g.

#include <iostream>
#include <limits>
using namespace std;

int main() {
    typedef std::numeric_limits<double> dbl;
    cout.precision(dbl::max_digits10 - 2); // 15 significant digits, the same as dbl::digits10
    cout << dbl::max_digits10 << endl;     // prints 17
    double x = 12345678.312;
    double a = 12345678.244;
    // Without setting the precision, the sum below would be printed with only 6 significant digits.
    cout << endl << x + a << endl;
}

example 2:

#include <iostream>
#include <limits>
using namespace std;

int main() {
    typedef std::numeric_limits<float> flt;
    cout.precision(flt::max_digits10 - 2); // 7 significant digits
    cout << flt::max_digits10 << endl;     // prints 9
    float x = 54.122111f;
    float a = 11.323111f;
    // Without setting the precision this prints a different, 6-digit value. The 7 digits are shared across
    // the decimal point: one more digit before it leaves one fewer after it. Doubles work the same way.
    cout << endl << x + a << endl;
}

Roughly how accurate is this description? Can it be used as a standard when confused?

user9318444 answered Sep 20 '22 14:09


The std::numeric_limits class template in the <limits> header provides information about the characteristics of numeric types.

For a floating-point type T, here are the greatest and least values representable in the type, in various senses of “greatest” and “least.” I also include the values for the common IEEE 754 64-bit binary type, which is called double in this answer. These are in decreasing order:

  • std::numeric_limits<T>::infinity() is the largest representable value, if T supports infinity. It is, of course, infinity. Whether the type T supports infinity is indicated by std::numeric_limits<T>::has_infinity.

  • std::numeric_limits<T>::max() is the largest finite value. For double, this is 2^1024 − 2^971, approximately 1.79769•10^308.

  • std::numeric_limits<T>::min() is the smallest positive normal value. Floating-point formats often have an interval where the exponent cannot get any smaller, but the significand (fraction portion of the number) is allowed to get smaller until it reaches zero. This comes at the expense of precision but has some desirable mathematical-computing properties. min() is the point where this precision loss starts. For double, this is 2^−1022, approximately 2.22507•10^−308.

  • std::numeric_limits<T>::denorm_min() is the smallest positive value. In types which have subnormal values, it is subnormal. Otherwise, it equals std::numeric_limits<T>::min(). For double, this is 2^−1074, approximately 4.94066•10^−324.

  • std::numeric_limits<T>::lowest() is the least finite value. It is usually a negative number large in magnitude. For double, this is −(2^1024 − 2^971), approximately −1.79769•10^308.

  • If std::numeric_limits<T>::has_infinity and std::numeric_limits<T>::is_signed are true, then -std::numeric_limits<T>::infinity() is the least value. It is, of course, negative infinity.

Another characteristic you may be interested in is:

  • std::numeric_limits<T>::digits10 is the greatest number of decimal digits such that converting any decimal number with that many digits to T and then converting back to the same number of decimal digits will yield the original number. For double, this is 15.
Eric Postpischil answered Sep 21 '22 14:09