
Type-pun uint64_t as two uint32_t in C++20

This code to read a uint64_t as two uint32_t is UB due to the strict aliasing rule:

uint64_t v;
uint32_t lower = reinterpret_cast<uint32_t*>(&v)[0];
uint32_t upper = reinterpret_cast<uint32_t*>(&v)[1];

Likewise, this code to write the upper and lower parts of a uint64_t is UB for the same reason:

uint64_t v;
uint32_t* lower = reinterpret_cast<uint32_t*>(&v);
uint32_t* upper = reinterpret_cast<uint32_t*>(&v) + 1;

*lower = 1;
*upper = 1;

How can one write this code in a safe and clean way in modern C++20, potentially using std::bit_cast?

Sebastian Hoffmann asked Nov 10 '21 10:11







2 Answers

Using std::bit_cast:


#include <bit>
#include <array>
#include <cstdint>
#include <iostream>

int main() {
    uint64_t x = 0x12345678'87654321ULL;
    // Convert one u64 -> two u32
    auto v = std::bit_cast<std::array<uint32_t, 2>>(x);
    std::cout << std::hex << v[0] << " " << v[1] << std::endl;
    // Convert two u32 -> one u64
    auto y = std::bit_cast<uint64_t>(v);
    std::cout << std::hex << y << std::endl;
}

Output (on a little-endian machine):

87654321 12345678
1234567887654321

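std::bit_cast is available since C++20. Prior to C++20 you can implement it manually through std::memcpy, with the one difference that such an implementation is not constexpr, unlike the C++20 version:

#include <cstring>
#include <type_traits>

template <class To, class From>
inline To bit_cast(From const & src) noexcept {
    //return std::bit_cast<To>(src);
    static_assert(sizeof(To) == sizeof(From),
        "Source and destination should have the same size");
    static_assert(std::is_trivially_copyable_v<From> && std::is_trivially_copyable_v<To>,
        "Both types should be trivially copyable");
    static_assert(std::is_trivially_constructible_v<To>,
        "Destination type should be trivially constructible");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
}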

For this specific case of integers, plain bit shifts and ORs are a simple and efficient way to convert one u64 to two u32 and back. std::bit_cast is more generic, supporting any pair of trivially copyable types of equal size, and on modern compilers with optimization enabled it should compile to code as efficient as the bit arithmetic.

An extra benefit of bit arithmetic is that it is endianness independent, unlike std::bit_cast.


#include <cstdint>
#include <iostream>

int main() {
    uint64_t x = 0x12345678'87654321ULL;
    // Convert one u64 -> two u32
    uint32_t lo = uint32_t(x), hi = uint32_t(x >> 32);
    std::cout << std::hex << lo << " " << hi << std::endl;
    // Convert two u32 -> one u64
    uint64_t y = (uint64_t(hi) << 32) | lo;
    std::cout << std::hex << y << std::endl;
}

Output:

87654321 12345678
1234567887654321

Note: as @Jarod42 points out, the bit-shifting solution is not equivalent to the memcpy/bit_cast solution; whether they agree depends on endianness. On a little-endian CPU, memcpy/bit_cast puts the least significant half (lo) in array element v[0] and the most significant half (hi) in v[1], while on a big-endian CPU lo goes to v[1] and hi goes to v[0]. The bit-shifting solution is endianness independent: on all systems it gives the most significant half (hi) as uint32_t(num_64 >> 32) and the least significant half (lo) as uint32_t(num_64).

Arty answered Oct 24 '22 23:10



in a safe and clean way

Do not use reinterpret_cast. Do not rely on unclear code whose behavior depends on specific compiler settings or is otherwise uncertain. Use exact arithmetic operations with well-defined results. Classes and operator overloads are all there waiting for you. For example, some global functions:

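#include <cstdint>
#include <iostream>

// Proxy that reads and writes the upper 32 bits of a referenced uint64_t.
struct UpperUint64Ref {
   uint64_t &v;
   UpperUint64Ref(uint64_t &v) : v(v) {}
   UpperUint64Ref operator=(uint32_t a) {
      v &= 0x00000000ffffffffull;   // clear the upper half
      v |= (uint64_t)a << 32;       // store the new upper half
      return *this;
   }
   operator uint32_t() const {
      return uint32_t(v >> 32);     // read only the upper half
   }
};
// Proxy that reads and writes the lower 32 bits.
struct LowerUint64Ref {
    uint64_t &v;
    LowerUint64Ref(uint64_t &v) : v(v) {}
    LowerUint64Ref operator=(uint32_t a) {
       v &= 0xffffffff00000000ull;  // clear the lower half
       v |= a;                      // store the new lower half
       return *this;
    }
    operator uint32_t() const {
       return uint32_t(v);          // read only the lower half
    }
};
UpperUint64Ref upper(uint64_t& v) { return v; }
LowerUint64Ref lower(uint64_t& v) { return v; }

int main() {
   uint64_t v = 0;   // initialized: the proxies read v before writing it
   upper(v) = 1;
   lower(v) = 2;
   std::cout << std::hex << v << '\n';   // prints 100000002
}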

Or an interface object:

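#include <cstdint>
#include <iostream>

// Wraps a uint64_t and hands out proxies for its two halves.
struct Uint64Ref {
   uint64_t &v;
   Uint64Ref(uint64_t &v) : v(v) {}
   struct UpperReference {
       uint64_t &v;
       UpperReference(uint64_t &v) : v(v) {}
       UpperReference operator=(uint32_t a) {
           v &= 0x00000000ffffffffull;   // clear the upper half
           v |= (uint64_t)a << 32u;      // store the new upper half
           return *this;                 // a non-void function must return
       }
   };
   UpperReference upper() {
      return v;
   }
   struct LowerReference {
       uint64_t &v;
       LowerReference(uint64_t &v) : v(v) {}
       LowerReference operator=(uint32_t a) {
           v &= 0xffffffff00000000ull;   // clear the lower half
           v |= a;                       // store the new lower half
           return *this;
       }
   };
   LowerReference lower() { return v; }
};
int main() {
   uint64_t v = 0;   // initialized: the proxies read v before writing it
   Uint64Ref r{v};
   r.upper() = 1;
   r.lower() = 2;
   std::cout << std::hex << v << '\n';   // prints 100000002
}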
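When proxy objects are more machinery than needed, the same mask-and-shift arithmetic can be written as plain functions that return the updated value. This is a minimal sketch rather than part of the approach above; the names with_upper and with_lower are hypothetical:

```cpp
#include <cstdint>

// Return v with its upper 32 bits replaced by hi.
constexpr std::uint64_t with_upper(std::uint64_t v, std::uint32_t hi) noexcept {
    return (v & 0x00000000ffffffffull) | (std::uint64_t{hi} << 32);
}

// Return v with its lower 32 bits replaced by lo.
constexpr std::uint64_t with_lower(std::uint64_t v, std::uint32_t lo) noexcept {
    return (v & 0xffffffff00000000ull) | lo;
}

// Checked at compile time, independent of endianness.
static_assert(with_upper(0, 1) == 0x0000000100000000ull);
static_assert(with_lower(0x0000000100000000ull, 1) == 0x0000000100000001ull);
```

A call such as v = with_upper(v, 1); then plays the role of the question's *upper = 1; without any pointer cast.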
KamilCuk answered Oct 24 '22 21:10
