
Is transmuting bytes to a float safe or might it produce undefined behavior?

Are there byte sequences that, when transmuted into either f32 or f64, produce undefined behavior in Rust? I'm counting non-finite values, such as NaN and infinity, as valid floating-point values.

The comments to this answer hint that there may be some problem converting a float from raw bytes.

ideasman42 asked Jan 04 '23 01:01


1 Answer

The Rust reference provides a good list of situations where undefined behavior occurs. Of those, the one that most closely relates to the question is the following:

Invalid values in primitive types, even in private fields/locals:

  • Dangling/null references or boxes
  • A value other than false (0) or true (1) in a bool
  • A discriminant in an enum not included in the type definition
  • A value in a char which is a surrogate or above char::MAX
  • Non-UTF-8 byte sequences in a str

Floating-point types are not listed. This is because any bit sequence (32 bits for f32; 64 bits for f64) is a valid state for a floating-point value, in accordance with the IEEE 754-2008 binary32 and binary64 floating-point types. The value might not be normal (the other classes are zero, subnormal, infinite, and not a number), but it is valid nonetheless.
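As a small illustration of the point above, the following sketch reinterprets a handful of hand-picked bit patterns with the safe `f32::from_bits` and asks the standard library which class each one falls into. The helper name `classify_bits` is made up for this example.

```rust
use std::num::FpCategory;

/// Reinterprets a raw 32-bit pattern as an f32 and reports its IEEE 754 class.
/// (Hypothetical helper, only for illustration.)
fn classify_bits(bits: u32) -> FpCategory {
    f32::from_bits(bits).classify()
}

fn main() {
    // One sample pattern per floating-point class.
    let samples: [u32; 5] = [
        0x0000_0000, // +0.0
        0x0000_0001, // smallest positive subnormal
        0x3F80_0000, // 1.0
        0x7F80_0000, // +infinity
        0x7FC0_0000, // a quiet NaN
    ];
    for bits in samples {
        println!("{:#010X} -> {:?}", bits, classify_bits(bits));
    }
}
```

Every pattern maps to some class; none of them is rejected or treated as an invalid value.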

In the end though, there should always be Another Way around transmute. In particular, the byteorder crate provides a safe and intuitive way to read numbers from a stream of bytes.

use byteorder::{ByteOrder, LittleEndian}; // or NativeEndian

fn main() {
    // 0x7F800000 stored in little-endian byte order: positive infinity
    let bytes = [0x00u8, 0x00, 0x80, 0x7F];
    let number = LittleEndian::read_f32(&bytes);
    println!("{}", number); // prints "inf"
}
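If pulling in a crate is not desirable, the standard library offers a similar safe route through `f32::from_le_bytes` (with `from_be_bytes` and `from_ne_bytes` as siblings). The sketch below is one possible way to read from a slice; `read_f32_le` is a hypothetical helper name, not part of any library.

```rust
/// Reads an f32 from the first four bytes of `input`, interpreted as
/// little-endian. Returns None if fewer than four bytes are available.
/// (Hypothetical helper, only for illustration.)
fn read_f32_le(input: &[u8]) -> Option<f32> {
    let chunk = input.get(..4)?;
    Some(f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
}

fn main() {
    let bytes = [0x00u8, 0x00, 0x80, 0x7F];
    match read_f32_le(&bytes) {
        Some(number) => println!("{}", number),
        None => println!("not enough bytes"),
    }
}
```

Neither this version nor the byteorder one needs an `unsafe` block.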



Ok, there actually is a very peculiar edge case where transmuting bits to a float can result in a signalling NaN, which in some CPU architectures and configurations will trigger a low-level exception. See the discussion in rust#39271 for details. It is currently known that materializing signalling NaNs is not undefined behavior, and that unless floating-point exceptions are enabled (they are not by default), this is unlikely to be a problem.
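For readers who want to know whether a given bit pattern is a signalling NaN before turning it into a float, here is a minimal sketch that inspects the bits as integers only. It assumes the IEEE 754-2008 convention used by x86 and ARM, where a set top mantissa bit means "quiet"; the constant and function names are made up for this example.

```rust
const EXP_MASK: u32 = 0x7F80_0000; // all eight exponent bits
const MANT_MASK: u32 = 0x007F_FFFF; // all 23 mantissa bits
const QUIET_BIT: u32 = 0x0040_0000; // most significant mantissa bit

/// True if `bits` encodes any NaN: exponent all ones, mantissa non-zero.
fn is_nan_bits(bits: u32) -> bool {
    (bits & EXP_MASK) == EXP_MASK && (bits & MANT_MASK) != 0
}

/// True if `bits` encodes a signalling NaN, ASSUMING the IEEE 754-2008
/// interpretation of the quiet bit (this does not hold on legacy MIPS).
fn is_signaling_nan_bits(bits: u32) -> bool {
    is_nan_bits(bits) && (bits & QUIET_BIT) == 0
}

fn main() {
    for bits in [0x7F80_0001u32, 0x7FC0_0000, 0x7F80_0000, 0x3F80_0000] {
        println!(
            "{:#010X}: nan={} signaling={}",
            bits,
            is_nan_bits(bits),
            is_signaling_nan_bits(bits)
        );
    }
}
```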

The decision from the Rust library team, which is already implemented, is that transmuting to a float is safe, even without any kind of masking. The reasoning is described very well in the documentation for f32::from_bits:

This is currently identical to transmute::<u32, f32>(v) on all platforms. It turns out this is incredibly portable, for two reasons:

  • Floats and Ints have the same endianness on all supported platforms.
  • IEEE-754 very precisely specifies the bit layout of floats.

However there is one caveat: prior to the 2008 version of IEEE-754, how to interpret the NaN signaling bit wasn't actually specified. Most platforms (notably x86 and ARM) picked the interpretation that was ultimately standardized in 2008, but some didn't (notably MIPS). As a result, all signaling NaNs on MIPS are quiet NaNs on x86, and vice-versa.

Rather than trying to preserve signaling-ness cross-platform, this implementation favours preserving the exact bits. This means that any payloads encoded in NaNs will be preserved even if the result of this method is sent over the network from an x86 machine to a MIPS one.

If the results of this method are only manipulated by the same architecture that produced them, then there is no portability concern.

If the input isn't NaN, then there is no portability concern.

If you don't care about signalingness (very likely), then there is no portability concern.
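The bit-preserving behavior described in the quoted documentation can be sketched as a round trip through a byte buffer, as one might do before sending a float over the network. The `encode` and `decode` names are hypothetical, and the example uses a quiet NaN with an arbitrary payload.

```rust
/// Serializes a float as four big-endian bytes (hypothetical wire format).
fn encode(value: f32) -> [u8; 4] {
    value.to_bits().to_be_bytes()
}

/// Rebuilds the float from four big-endian bytes without altering any bit.
fn decode(bytes: [u8; 4]) -> f32 {
    f32::from_bits(u32::from_be_bytes(bytes))
}

fn main() {
    // A quiet NaN carrying the arbitrary payload 0x1234 in its low mantissa bits.
    let original_bits: u32 = 0x7FC0_1234;
    let value = f32::from_bits(original_bits);

    let wire = encode(value);
    let restored = decode(wire);

    println!("sent:     {:#010X}", original_bits);
    println!("received: {:#010X}", restored.to_bits());
    println!("still NaN: {}", restored.is_nan());
}
```

Comparing with `to_bits()` rather than `==` matters here, since a NaN never compares equal to itself as a float.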

Some parsing/encoding libraries may still be converting all kinds of NaN to an assuredly quiet NaN, as this matter was uncertain for a while in the history of Rust.
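The kind of masking such libraries perform could look roughly like the following. This is only a sketch under the IEEE 754-2008 convention (top mantissa bit set means quiet), not the code of any particular library, and `quiet_nan_bits` is a made-up name.

```rust
const QUIET_BIT: u32 = 0x0040_0000; // most significant mantissa bit of an f32

/// If `bits` encodes a NaN, forces the quiet bit on and keeps the rest of
/// the payload; any other pattern is returned unchanged.
/// ASSUMES the IEEE 754-2008 meaning of the quiet bit.
fn quiet_nan_bits(bits: u32) -> u32 {
    let is_nan = (bits & 0x7F80_0000) == 0x7F80_0000 && (bits & 0x007F_FFFF) != 0;
    if is_nan {
        bits | QUIET_BIT
    } else {
        bits
    }
}

fn main() {
    let raw: u32 = 0x7F80_0001; // a signalling NaN under the 2008 convention
    let masked = quiet_nan_bits(raw);
    let value = f32::from_bits(masked);
    println!("{:#010X} -> {:#010X} (is_nan = {})", raw, masked, value.is_nan());
}
```

Given the library team's decision described above, this step is optional; it only matters where signalling-ness itself is a concern.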

E_net4 stands with Ukraine answered Jan 13 '23 16:01