
Why is “cast from ‘X*’ to ‘Y’ loses precision” a hard error and what is a suitable fix for legacy code

1. Why?

Code like this used to work and it's kind of obvious what it is supposed to mean. Is the compiler even allowed (by the specification) to make it an error?

I know that it's losing precision and I would be happy with a warning. But it still has well-defined semantics (at least for unsigned types, a downsizing cast is defined) and the user just might want to do it.

2. Workaround

I have legacy code that I don't want to refactor too much because it's rather tricky and already debugged. It is doing two things:

  1. Sometimes stores integers in pointer variables. The code only casts the pointer to integer if it stored an integer in it before. Therefore while the cast is downsizing, the overflow never happens in reality. The code is tested and works.

    When an integer is stored, it always fits in a plain old unsigned, so widening the integer type is not considered a good idea; and the pointer is passed around quite a bit, so changing its type would be somewhat invasive.

  2. Uses the address as a hash value, a rather common thing to do. The hash table is not large enough for a wider hash type to make sense.

    The code uses plain unsigned for hash value, but note that the more usual type of size_t may still generate the error, because there is no guarantee that sizeof(size_t) >= sizeof(void *). On platforms with segmented memory and far pointers, size_t only has to cover the offset part.

So what are the least invasive suitable workarounds? The code is known to work when compiled with a compiler that does not produce this error, so I really want to do the operation, not change it.


Notes:

void *x;
int y;
union U { void *p; int i; } u;
  1. *(int*)&x and u.p = x, u.i are not equivalent to (int)x and are not the opposite of (void *)y. On big-endian architectures, the first two return the bytes at lower addresses, while the latter works on the low-order bytes, which may reside at higher addresses.
  2. *(int*)&x and u.p = x, u.i are both strict aliasing violations, (int)x is not.
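
The difference described in note 1 can be made concrete with a small sketch. The function names are hypothetical, and the sketch assumes that uintptr_t exists and that unsigned is no wider than a pointer. It uses memcpy rather than *(int*)&x so that the sketch itself stays clear of the aliasing problem from note 2.

```cpp
#include <cstdint>
#include <cstring>

// Value conversion: the result is the low-order bits of the converted
// address, wherever those bits happen to live in memory.
unsigned low_order_bits(void *p) {
    return static_cast<unsigned>(reinterpret_cast<std::uintptr_t>(p));
}

// Representation copy: the result is built from the bytes stored at the
// lowest addresses of the pointer object, which is what *(int*)&x or the
// union would read.
unsigned lowest_address_bytes(void *p) {
    static_assert(sizeof(unsigned) <= sizeof(void *),
                  "sketch assumes unsigned is no wider than a pointer");
    unsigned out = 0;
    std::memcpy(&out, &p, sizeof out);
    return out;
}
```

On a typical little-endian machine the two functions tend to agree; on a big-endian machine with 64-bit pointers, lowest_address_bytes would usually return the high-order half instead.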
Jan Hudec, asked Feb 05 '14 10:02


3 Answers

C++, 5.2.10:

4 - A pointer can be explicitly converted to any integral type large enough to hold it. [...]

C, 6.3.2.3:

6 - Any pointer type may be converted to an integer type. [...] If the result cannot be represented in the integer type, the behavior is undefined. [...]

So (int) p is illegal if int is 32-bit and void * is 64-bit; a C++ compiler is correct to give you an error, while a C compiler may either give an error on translation or emit a program with undefined behaviour.

You should write, adding a single conversion:

(int) (intptr_t) p

or, using C++ syntax,

static_cast<int>(reinterpret_cast<intptr_t>(p))

If you're converting to an unsigned integer type, convert via uintptr_t instead of intptr_t.
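
For the "integer stored in a pointer variable" case from the question, the two-step conversion could be wrapped in a pair of helpers. This is only a sketch: pack_unsigned and unpack_unsigned are hypothetical names, not part of the original code base, and it assumes the platform provides uintptr_t and round-trips small integers through a pointer unchanged (implementation-defined, but true on mainstream platforms).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

static_assert(sizeof(std::uintptr_t) >= sizeof(unsigned),
              "sketch assumes uintptr_t can hold any unsigned value");

// Hypothetical helper: store a small integer in a pointer-typed slot.
inline void *pack_unsigned(unsigned value) {
    return reinterpret_cast<void *>(static_cast<std::uintptr_t>(value));
}

// Hypothetical helper: read the integer back. Only valid if the slot was
// filled by pack_unsigned, which is the invariant the legacy code relies on.
inline unsigned unpack_unsigned(void *slot) {
    const std::uintptr_t wide = reinterpret_cast<std::uintptr_t>(slot);
    // Debug-build check that the narrowing below loses nothing.
    assert(wide <= std::numeric_limits<unsigned>::max());
    return static_cast<unsigned>(wide);
}
```

Keeping the casts in one place means the call sites that pass the pointer around do not need to change.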

ecatmur, answered Oct 15 '22 18:10


This is a tough one to solve "generically", because "loses precision" indicates that your pointers are larger than the type you are trying to store them in. That may well be "ok" in your mind, but the compiler is concerned that you will later restore the int value into a pointer that has lost its upper 32 bits (assuming 32-bit int and 64-bit pointers; there are other possible combinations).

There is uintptr_t, which is large enough to hold a pointer on the system (where it is provided), so typically you can overcome the actual error with:

int x = static_cast<int>(reinterpret_cast<uintptr_t>(some_ptr));

This first converts the pointer to a pointer-sized integer, and then casts that integer to the smaller type.

Mats Petersson, answered Oct 15 '22 19:10


Answer for C

Converting pointers to integers is implementation-defined. Your problem is that the code you are talking about seems never to have been correct; it probably only worked on older architectures where both int and pointers are 32 bits wide.

The only types that are supposed to convert without loss are [u]intptr_t, if they exist on the platform (usually they do). Which part of such a uintptr_t is appropriate for your hash function is difficult to tell; you shouldn't make any assumptions about that. I would go for something like

uintptr_t n = (uintptr_t)x;

and then

((n >> 32) ^ n) & UINT32_MAX

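On 64-bit architectures this folds traces of the upper bits into the result. On 32-bit architectures the upper half contributes nothing, but note that shifting a 32-bit uintptr_t by 32 is undefined, so n should be widened (for example to uint64_t) before the shift.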

For C++ basically the same should apply; just the cast would be reinterpret_cast<std::uintptr_t>(x).
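
A C++ sketch of the folding idea, applied to the hash-table case from the question, might look like the following. The names fold_address and bucket_for are hypothetical, and the sketch assumes uintptr_t and uint64_t are available. The address is copied into a 64-bit variable before shifting so that the shift is well defined whatever the pointer width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: fold an address into 32 bits so that the upper
// half, if there is one, still influences the result.
inline std::uint32_t fold_address(const void *p) {
    const std::uint64_t n = reinterpret_cast<std::uintptr_t>(p);
    return static_cast<std::uint32_t>((n >> 32) ^ n);
}

// Hypothetical helper: map an address to a bucket index in a table with
// bucket_count buckets.
inline unsigned bucket_for(const void *p, unsigned bucket_count) {
    assert(bucket_count > 0);
    return static_cast<unsigned>(fold_address(p) % bucket_count);
}
```

Whether the plain XOR fold distributes well enough depends on the allocator and the table size, so it is worth measuring before relying on it.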

Jens Gustedt, answered Oct 15 '22 19:10