I was reading about floating-point NaN values in the Java Language Specification (I'm boring). A 32-bit float
has this bit format:
seee eeee emmm mmmm mmmm mmmm mmmm mmmm
s is the sign bit, e are the exponent bits, and m are the mantissa bits. A NaN value is encoded as an exponent of all 1s with mantissa bits that are not all 0 (an all-0 mantissa would be +/- infinity). This means that there are lots of different possible NaN values (having different s and m bit values).
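As a minimal sketch of the encoding described above, the fields can be pulled out of a 32-bit pattern with shifts and masks. The class and method names (FloatBits, sign, exponent, mantissa, isNaNPattern) are only illustrative, not part of any library:

```java
public class FloatBits {
    // Sign: the top bit of the 32-bit pattern.
    static int sign(int bits) {
        return bits >>> 31;
    }

    // Exponent: the next 8 bits.
    static int exponent(int bits) {
        return (bits >>> 23) & 0xFF;
    }

    // Mantissa: the low 23 bits.
    static int mantissa(int bits) {
        return bits & 0x7FFFFF;
    }

    // NaN: exponent all 1s and a mantissa that is not all 0.
    static boolean isNaNPattern(int bits) {
        return exponent(bits) == 0xFF && mantissa(bits) != 0;
    }

    public static void main(String[] args) {
        // Two quiet-looking NaNs, one NaN with only the lowest mantissa bit set, and +infinity.
        int[] samples = {0x7fc00000, 0xffc00000, 0x7f800001, 0x7f800000};
        for (int bits : samples) {
            System.out.printf("%08x sign=%d exp=%02x mantissa=%06x NaN=%b%n",
                    bits, sign(bits), exponent(bits), mantissa(bits), isNaNPattern(bits));
        }
    }
}
```

The check works purely on the integer pattern, so it does not depend on how the hardware treats the value once it is loaded as a float.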
On this, JLS §4.2.3 says:
IEEE 754 allows multiple distinct NaN values for each of its single and double floating-point formats. While each hardware architecture returns a particular bit pattern for NaN when a new NaN is generated, a programmer can also create NaNs with different bit patterns to encode, for example, retrospective diagnostic information.
The text in the JLS seems to imply that the result of, for example, 0.0/0.0 has a hardware-dependent bit pattern. Depending on whether that expression is evaluated as a compile-time constant, the hardware in question might be the machine the Java program was compiled on or the machine the program runs on. This all seems very flaky if true.
I ran the following test:
System.out.println(Integer.toHexString(Float.floatToRawIntBits(0.0f/0.0f)));
System.out.println(Integer.toHexString(Float.floatToRawIntBits(Float.NaN)));
System.out.println(Long.toHexString(Double.doubleToRawLongBits(0.0d/0.0d)));
System.out.println(Long.toHexString(Double.doubleToRawLongBits(Double.NaN)));
The output on my machine is:
7fc00000
7fc00000
7ff8000000000000
7ff8000000000000
The output shows nothing unexpected. The exponent bits are all 1. The upper bit of the mantissa is also 1, which for NaNs apparently indicates a "quiet NaN" as opposed to a "signalling NaN" (https://en.wikipedia.org/wiki/NaN#Floating_point). The sign bit and the rest of the mantissa bits are 0. The output also shows that there was no difference between the NaNs generated on my machine and the constant NaNs from the Float and Double classes.
My question is, is that output guaranteed in Java, regardless of the CPU of the compiler or VM, or is it all genuinely unpredictable? The JLS is mysterious about this.
If that output is guaranteed for 0.0/0.0, are there any arithmetic ways of producing NaNs that do have other (possibly hardware-dependent?) bit patterns? (I know intBitsToFloat/longBitsToDouble can encode other NaNs, but I'd like to know if other values can occur from normal arithmetic.)
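One way to probe this is to generate NaNs at run time through several different operations and print their raw bits. The following is only a sketch: the class name ArithmeticNaNs and its fields are made up for illustration, and the operands are kept in volatile fields on the assumption that this stops the compiler from folding the expressions into compile-time constants. The printed patterns may differ between machines, so no expected output is given.

```java
public class ArithmeticNaNs {
    // Non-final, volatile fields so the expressions below are not constant expressions.
    static volatile float zero = 0.0f;
    static volatile float inf = Float.POSITIVE_INFINITY;

    // Produces NaNs through ordinary arithmetic rather than intBitsToFloat.
    static float[] makeNaNs() {
        return new float[] {
            zero / zero,              // 0/0
            inf - inf,                // infinity minus infinity
            zero * inf,               // 0 times infinity
            (float) Math.sqrt(-1.0),  // square root of a negative number
            -(zero / zero)            // negation of a NaN
        };
    }

    public static void main(String[] args) {
        for (float f : makeNaNs()) {
            // floatToRawIntBits does not collapse NaNs to the canonical pattern.
            System.out.println(Integer.toHexString(Float.floatToRawIntBits(f)));
        }
    }
}
```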
A followup point: I've noticed that Float.NaN and Double.NaN specify their exact bit pattern, but in the source (Float, Double) they are generated by 0.0/0.0. If the result of that division is really dependent on the hardware of the compiler, it seems like there is a flaw there in either the spec or the implementation.
A NaN (Not-a-Number) is a symbolic entity encoded in floating-point format. There are two types of NaNs: quiet and signalling. A quiet NaN propagates through almost every arithmetic operation without signalling an exception.
This is what §2.3.2 of the JVM 7 spec has to say about it:
The elements of the double value set are exactly the values that can be represented using the double floating-point format defined in the IEEE 754 standard, except that there is only one NaN value (IEEE 754 specifies 2^53-2 distinct NaN values).
and §2.8.1:
The Java Virtual Machine has no signaling NaN value.
So technically there is only one NaN. But §4.2.3 of the JLS also says (right after your quote):
For the most part, the Java SE platform treats NaN values of a given type as though collapsed into a single canonical value, and hence this specification normally refers to an arbitrary NaN as though to a canonical value.
However, version 1.3 of the Java SE platform introduced methods enabling the programmer to distinguish between NaN values: the Float.floatToRawIntBits and Double.doubleToRawLongBits methods. The interested reader is referred to the specifications for the Float and Double classes for more information.
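To illustrate the distinction the JLS draws here, the sketch below compares Float.floatToIntBits, which is documented to return 0x7fc00000 for every NaN, with Float.floatToRawIntBits, which tries to preserve the actual pattern. The class name CanonicalNaN, its helper methods, and the sample pattern 0x7fc12345 are arbitrary choices for illustration:

```java
public class CanonicalNaN {
    // Collapses any NaN to the canonical pattern 0x7fc00000.
    static int canonical(int bits) {
        return Float.floatToIntBits(Float.intBitsToFloat(bits));
    }

    // Keeps the NaN's bits as the platform delivers them (may still be altered in transit).
    static int raw(int bits) {
        return Float.floatToRawIntBits(Float.intBitsToFloat(bits));
    }

    public static void main(String[] args) {
        int payload = 0x7fc12345; // a quiet-looking NaN with extra mantissa bits set
        System.out.println("canonical: " + Integer.toHexString(canonical(payload)));
        System.out.println("raw:       " + Integer.toHexString(raw(payload)));
    }
}
```

The canonical line always prints 7fc00000. The raw line is likely to print the original pattern, but that is not something the specification promises.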
Which I take to mean exactly what you and CandiedOrange propose: It is dependent on the underlying processor, but Java treats them all the same.
But it gets better: apparently, it is entirely possible that your NaN values are silently converted to different NaNs, as described in Double.longBitsToDouble():
Note that this method may not be able to return a double NaN with exactly same bit pattern as the long argument. IEEE 754 distinguishes between two kinds of NaNs, quiet NaNs and signaling NaNs. The differences between the two kinds of NaN are generally not visible in Java. Arithmetic operations on signaling NaNs turn them into quiet NaNs with a different, but often similar, bit pattern. However, on some processors merely copying a signaling NaN also performs that conversion. In particular, copying a signaling NaN to return it to the calling method may perform this conversion. So longBitsToDouble may not be able to return a double with a signaling NaN bit pattern. Consequently, for some long values, doubleToRawLongBits(longBitsToDouble(start)) may not equal start. Moreover, which particular bit patterns represent signaling NaNs is platform dependent; although all NaN bit patterns, quiet or signaling, must be in the NaN range identified above.
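The round trip mentioned in that note can be tried directly. This is a minimal sketch; the class name NaNRoundTrip and the starting pattern 0x7ff0000000000001L (exponent all 1s, only the lowest mantissa bit set, which is a signalling NaN under the x86 convention) are assumptions chosen for illustration. Whether the result equals the starting value depends on the platform, so no expected output is given.

```java
public class NaNRoundTrip {
    // Converts a bit pattern to a double and back without canonicalizing NaNs.
    static long roundTrip(long start) {
        return Double.doubleToRawLongBits(Double.longBitsToDouble(start));
    }

    public static void main(String[] args) {
        long start = 0x7ff0000000000001L;
        long result = roundTrip(start);
        System.out.println("start:  " + Long.toHexString(start));
        System.out.println("result: " + Long.toHexString(result));
        System.out.println("same:   " + (start == result));
    }
}
```

Whatever comes back should still fall in the NaN range, even if the individual bits differ from the input.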
For reference, there is a table of the hardware-dependent NaNs here. In summary:
- x86: quiet: Sign=0 Exp=0x7ff Frac=0x80000; signalling: Sign=0 Exp=0x7ff Frac=0x40000
- PA-RISC: quiet: Sign=0 Exp=0x7ff Frac=0x40000; signalling: Sign=0 Exp=0x7ff Frac=0x80000
- Power: quiet: Sign=0 Exp=0x7ff Frac=0x80000; signalling: Sign=0 Exp=0x7ff Frac=0x5555555500055555
- Alpha: quiet: Sign=0 Exp=0 Frac=0xfff8000000000000; signalling: Sign=1 Exp=0x2aa Frac=0x7ff5555500055555
So, to verify this you would really need to get hold of one of these processors and try it out. Also, any insights on how to interpret the longer values for the Power and Alpha architectures are welcome.