It seems that we can trivially derive floats that are smaller than numeric_limits<float>::min(). Why? If numeric_limits<float>::min() isn't supposed to be the smallest positive float, what is it supposed to be?
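#include <iostream>
#include <limits>

int main() {
    // Smallest positive *normalized* float, about 1.17549e-38.
    float mind = std::numeric_limits<float>::min();
    // Halving it gives a subnormal value that is still greater than zero.
    float smaller_than_mind = std::numeric_limits<float>::min() / 2;
    // Prints 1: a positive float smaller than min() exists.
    std::cout << (mind > smaller_than_mind && smaller_than_mind > 0) << std::endl;
}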
Run it here: https://onlinegdb.com/ry3AcxjXz
numeric_limits::min returns the minimum finite value representable by the numeric type T. For floating-point types with denormalization, min returns the minimum positive normalized value. Note that this behavior may be unexpected, especially when compared to the behavior of min for integral types.
short: min: -32768 max: 32767
int: min: -2147483648 max: 2147483647
long: min: -2147483648 max: 2147483647
float: min: 1.17549e-038 max: 3.40282e+038
double: min: 2.22507e-308 max: 1.79769e+308
long double: min: 2.22507e-308 max: 1.79769e+308
unsigned short: min: 0 max: 65535
unsigned int: min: 0 max: 4294967295
...
You can use std::numeric_limits, which is defined in <limits>, to find the minimum or maximum value of a type (as long as a specialization exists for the type). You can also use it to retrieve infinity (put a - in front for negative infinity).
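As a minimal sketch of that usage: the helper names most_negative and negative_infinity below are made up for this illustration and are not part of the standard library. The point it shows is that, for floating-point types, the most negative finite value comes from lowest(), while min() is a small positive number.

```cpp
#include <limits>

// Hypothetical helpers, named only for this illustration.

// Most negative finite value of T. For floating-point types this is
// lowest(), not min(): min() is the smallest positive normalized value.
template <typename T>
constexpr T most_negative() {
    return std::numeric_limits<T>::lowest();
}

// Negative infinity, for types that actually have an infinity.
template <typename T>
constexpr T negative_infinity() {
    static_assert(std::numeric_limits<T>::has_infinity, "T has no infinity");
    return -std::numeric_limits<T>::infinity();
}
```

For an integral type such as int, most_negative<int>() is the same value as numeric_limits<int>::min(); for float it is the negation of numeric_limits<float>::max().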
min() of a floating-point type returns the minimum positive value that has the full expressive power of the format: all bits of its significand are available for use.
Smaller positive values are called subnormal. Although they are representable, high bits of the significand are necessarily zero.
The IEEE-754 64-bit binary floating-point format represents a number with a sign (+ or -, encoded as 0 or 1), an exponent (-1022 to +1023, encoded as 1 to 2046, plus 0 and 2047 as special cases), and a 53-bit significand (encoded with 52 bits plus a clue from the exponent field).
For normal values, the exponent field is 1 to 2046 (representing exponents of -1022 to +1023) and the significand (in binary) is 1.xxx…xxx, where xxx…xxx represents 52 more bits. In all of these values, the value of the lowest bit of the significand is 2^-52 times the value of the highest significant bit (the first 1 in it).
For subnormal values, the exponent field is 0. This still represents an exponent of -1022, but it means the high bit of the significand is 0. The significand is now 0.xxx…xxx. As lower and lower values are used in this range, more leading bits of the significand become zero. Now, the value of the lowest bit of the significand is greater than 2^-52 times the value of the highest significant bit. You cannot adjust numbers as finely in this interval as in the normal interval because not all the bits of the significand are available for arbitrary values: some leading bits are fixed at 0 to set the scale.
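The field layout described above can be inspected directly. This is only a sketch: the DoubleFields struct and the decode and is_subnormal functions are hypothetical names invented for the illustration, and the code assumes that double is the IEEE-754 64-bit binary format (checked with a static_assert).

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical struct and function names, used only for this illustration.
struct DoubleFields {
    std::uint64_t sign;      // 1 bit: 0 for +, 1 for -
    std::uint64_t exponent;  // 11-bit encoded exponent field (0..2047)
    std::uint64_t fraction;  // the 52 explicitly stored significand bits
};

// Splits a double into its three encoded fields.
// Assumption: double is the IEEE-754 64-bit binary format.
inline DoubleFields decode(double x) {
    static_assert(std::numeric_limits<double>::is_iec559, "expects IEEE-754 double");
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined way to read the bit pattern
    return DoubleFields{bits >> 63,
                        (bits >> 52) & 0x7FF,
                        bits & ((std::uint64_t{1} << 52) - 1)};
}

// Subnormal: exponent field is 0 and at least one fraction bit is set.
inline bool is_subnormal(double x) {
    const DoubleFields f = decode(x);
    return f.exponent == 0 && f.fraction != 0;
}
```

With this sketch, decode(numeric_limits<double>::min()) has an exponent field of 1 and a fraction of 0, which is the bottom of the normal range, while decode(numeric_limits<double>::denorm_min()) has an exponent field of 0 and only the lowest fraction bit set.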
Because of this, the relative errors that occur when working with numbers in this range tend to be greater than the relative errors in the normal range. The floating-point format has this subnormal range because, if it did not, the numbers would just cut off at the smallest normal value, and the gap between that normal value and zero would be a huge relative jump—100% of the value in a single step. By including subnormal numbers, the relative errors increase more gradually, and the absolute errors stay constant from this point until zero is reached.
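To make the spacing argument concrete, here is a small sketch using std::nextafter from <cmath>. The helper name gap_above is hypothetical, chosen for this example, and the behavior described assumes IEEE-754 doubles with subnormals enabled (no flush-to-zero mode).

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: distance from x to the next representable double above it.
inline double gap_above(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

Under those assumptions, gap_above(x) equals numeric_limits<double>::denorm_min() for every non-negative x in the subnormal range, so the absolute step is constant there. The relative step therefore grows as x shrinks: at min() it is epsilon(), and at denorm_min() the next value is twice as large.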
It is important to know where the bottom of the normal range is. min() tells you this. denorm_min() tells you the ultimate minimum positive value.
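The difference between the two can be checked with std::fpclassify from <cmath>. The sketch below assumes a platform where float supports subnormals; the function name category is made up for this illustration.

```cpp
#include <cmath>
#include <limits>
#include <string>

// Hypothetical helper: names the floating-point category of x.
inline std::string category(float x) {
    switch (std::fpclassify(x)) {
        case FP_ZERO:      return "zero";
        case FP_SUBNORMAL: return "subnormal";
        case FP_NORMAL:    return "normal";
        case FP_INFINITE:  return "infinite";
        default:           return "nan";
    }
}
```

On such a platform, category(numeric_limits<float>::min()) is "normal", while both category(numeric_limits<float>::min() / 2) and category(numeric_limits<float>::denorm_min()) are "subnormal".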
According to en.cppreference.com:
For floating-point types with denormalization, min returns the minimum positive normalized value. Note that this behavior may be unexpected, especially when compared to the behavior of min for integral types.
float is a type with denormalization, so this rule applies to it: min() is the smallest positive normalized float, not the smallest positive float.