
Reinterpret casted value varies by compiler

Tags: c++, c++11

For the same program:

const char* s = "abcd";
auto x1 = reinterpret_cast<const int64_t*>(s);
auto x2 = reinterpret_cast<const char*>(x1);
std::cout << *x1 << std::endl;
std::cout << x2 << std::endl; // Always "abcd"

In gcc5: 139639660962401
In gcc8: 1684234849

  1. Why does the value vary according to different compiler versions?
  2. What, then, is a compiler-safe way to move from const char* to int64_t and back (as in this problem: not for strings of digits, but for strings that contain other characters as well)?
tangy asked Jan 17 '19 19:01


2 Answers

  1. Why does the value vary according to different compiler versions?

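The behaviour is undefined. Dereferencing x1 reads an int64_t from memory that holds a char array, which violates the aliasing rules, and the 8-byte read runs past the end of the 5-byte literal "abcd".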

  2. What is then a compiler safe way to move from const char* to int64_t and backward

It is somewhat unclear what you mean by "move from const char* to int64_t". Based on the example, I assume you mean to create a mapping from a character sequence (no longer than fits) into a 64-bit integer, in a way that can be converted back by another process, possibly one compiled by a different compiler or compiler version.

First, create an int64_t object, initialised to zero:

int64_t i = 0;

Get the length of the string:

auto len = strlen(s);

Check that it fits, leaving at least one zero byte as a terminator:

assert(len < sizeof i);

Copy the bytes of the character sequence into the integer:

memcpy(&i, s, len);

As long as the integer type has no trap representations, the behaviour is well defined, and the resulting integer will be the same across compiler versions as long as the CPU endianness (and the representation of negative numbers) stays the same.

Reading the character string back doesn't require copying because char is exceptionally allowed to alias all other types:

auto back = reinterpret_cast<char*>(&i);
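The steps above can be collected into a pair of small helpers. This is only a sketch: the names pack and unpack are hypothetical, and unpack stops at the first zero byte or after 8 bytes, so it stays in bounds even for an integer that was not produced by pack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: copy a short string into a zero-initialised int64_t.
std::int64_t pack(const char* s) {
    std::int64_t i = 0;
    const std::size_t len = std::strlen(s);
    assert(len < sizeof i);          // keep at least one zero byte as terminator
    std::memcpy(&i, s, len);         // well defined: copies raw bytes, no aliasing issue
    return i;
}

// Hypothetical helper: read the bytes back through a char pointer,
// which is allowed to alias any object.
std::string unpack(const std::int64_t& i) {
    const char* p = reinterpret_cast<const char*>(&i);
    std::size_t n = 0;
    while (n < sizeof i && p[n] != '\0') {
        ++n;                         // never read beyond the 8 bytes of i
    }
    return std::string(p, n);
}
```

With these, unpack(pack("abcd")) gives back "abcd" on the same machine; the numeric value of the integer in between still depends on the byte order of the CPU.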

Note the qualification about endianness above. This method does not work if the integer is passed (across a network, for example) to a process running on a CPU with a different byte order. That case can be handled as well, by using bit shifts and masks to place each octet at a fixed position of significance.
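A minimal sketch of the shifting approach follows. The names encode and decode are hypothetical, and two choices here are assumptions rather than anything stated above: the first character goes into the least significant octet, and an unsigned 64-bit type is used so that the shifts never touch a sign bit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: character k is placed at bits [8k, 8k+8),
// regardless of the byte order of the host CPU.
std::uint64_t encode(const char* s) {
    const std::size_t len = std::strlen(s);
    assert(len <= 8);                // at most 8 octets fit
    std::uint64_t v = 0;
    for (std::size_t k = 0; k < len; ++k) {
        const auto octet = static_cast<unsigned char>(s[k]);
        v |= static_cast<std::uint64_t>(octet) << (8 * k);
    }
    return v;
}

// Hypothetical helper: extract octets in the same order, stopping at
// the first zero octet or after all 8 have been read.
std::string decode(std::uint64_t v) {
    std::string out;
    for (std::size_t k = 0; k < 8; ++k) {
        const auto octet = static_cast<unsigned char>((v >> (8 * k)) & 0xFF);
        if (octet == 0) {
            break;
        }
        out.push_back(static_cast<char>(octet));
    }
    return out;
}
```

Because the value is built arithmetically, encode("abcd") is 1684234849 on every platform, which happens to match what the memcpy approach yields on a little-endian CPU.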

eerorika answered Oct 05 '22 22:10


When you dereference the int64_t pointer, it reads past the end of the memory allocated for the string you cast from. If you made the string at least 8 bytes long, including the null terminator, the integer value would become stable in practice, although dereferencing a pointer obtained this way is still formally undefined behaviour.

const char* s = "abcdefg"; // plus null terminator
auto x1 = reinterpret_cast<const int64_t*>(s);
auto x2 = reinterpret_cast<const char*>(x1);
std::cout << *x1 << std::endl;
std::cout << x2 << std::endl; // Always "abcdefg"
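The effect of reading past the end shows up when the two values from the question are split into octets. The helper octet_at below is hypothetical, and the mapping from octet k to the byte at s + k assumes a little-endian machine, which the question does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: return octet k of v, where k = 0 is the least
// significant octet. On a little-endian CPU this is the byte at offset k.
unsigned octet_at(std::uint64_t v, std::size_t k) {
    assert(k < 8);
    return static_cast<unsigned>((v >> (8 * k)) & 0xFF);
}
```

Both values share octets 0 to 3 (0x61 0x62 0x63 0x64, that is "abcd") and octet 4 (0x00, the null terminator). They differ only in octet 5, which is 0x00 in 1684234849 and 0x7F in 139639660962401, so the difference comes entirely from bytes outside the string literal.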

If you wanted to store the pointer itself in an integer instead, you should use intptr_t and leave out the *, like this:

const char* s = "abcd";
auto x1 = reinterpret_cast<intptr_t>(s);
auto x2 = reinterpret_cast<const char*>(x1);
std::cout << x1 << std::endl;
std::cout << x2 << std::endl; // Always "abcd"
nate answered Oct 06 '22 00:10