Can anyone tell me why the internal representation of a nullptr assigned to a data member pointer of type Class::* is -1 for MSVC, clang and g++? On a 64-bit system, (size_t)1 << 63 would be a better choice: if you dereference a null member pointer by mistake, you are sure to touch kernel memory and crash, which would be a nice debugging aid.
Is there a deeper reason behind -1?
Sample:
struct X
{
    int x, y;
};
using member_ptr = int X::*;
member_ptr f()
{
    return nullptr;
}
... results in the following binary with g++:
movq $-1, %rax
ret
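The same value can be observed from within C++ by copying the bytes of the member pointer into an integer. This is a minimal sketch, assuming an Itanium-ABI compiler (g++ or clang on x86-64), where a data member pointer is stored as a ptrdiff_t-sized offset. raw_bits is a hypothetical helper, not a standard function, and the static_assert will reject compilers that use a different layout.

```cpp
#include <cstddef>
#include <cstring>

struct X
{
    int x, y;
};

using member_ptr = int X::*;

// Hypothetical helper: copy the object representation of a data member
// pointer into an integer. Assumes the pointer is ptrdiff_t-sized, which
// holds for the Itanium C++ ABI but is not guaranteed by the standard.
std::ptrdiff_t raw_bits(member_ptr p)
{
    static_assert(sizeof(member_ptr) == sizeof(std::ptrdiff_t),
                  "sketch assumes an offset-sized member pointer");
    std::ptrdiff_t bits;
    std::memcpy(&bits, &p, sizeof bits);
    return bits;
}
```

With g++ on x86-64, raw_bits(nullptr) gives -1, while raw_bits(&X::x) and raw_bits(&X::y) give the member offsets 0 and 4.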
There are three reasons why ~(0LLU) is preferable:
First, collisions. A data member pointer can be any offset from 0 up to the size of the struct or class, so ~(0LLU) has the least risk of colliding with an actually valid member pointer. You can't really have a struct that spans the full range of size_t:
<source>:2:21: error: size '9223372036854775808' of array 'x' exceeds maximum object size '9223372036854775807'
2 | long long x[1LLU<<63];
<source>:2:15: error: size of array 'x' exceeds maximum object size '9223372036854775807'
2 | long long x[1LLU<<62];
Note that the limit is (1LLU<<63) - 1, so 1LLU << 63 could not collide with a valid offset either. That kind of negates this argument. It might be different on a 16-bit system.
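The collision argument can be checked against both candidate values at compile time. This is a small sketch, assuming that no object is larger than PTRDIFF_MAX bytes, as the g++ diagnostics above suggest. The names max_object_size and could_be_member_offset are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: no object is larger than PTRDIFF_MAX bytes, matching the
// "maximum object size" g++ reports above. A member offset is then
// always smaller than that.
constexpr std::uint64_t max_object_size = PTRDIFF_MAX;

// Hypothetical helper: could this bit pattern be the offset of a member?
constexpr bool could_be_member_offset(std::uint64_t bits)
{
    return bits < max_object_size;
}

constexpr std::uint64_t all_ones = ~0ULL;       // what the compilers use
constexpr std::uint64_t top_bit  = 1ULL << 63;  // proposed in the question

static_assert(!could_be_member_offset(all_ones), "-1 never collides");
static_assert(!could_be_member_offset(top_bit), "1 << 63 never collides either");
static_assert(could_be_member_offset(0), "0 is the offset of the first member");
```

Under that assumption both sentinels are safe, and only 0 is ruled out.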
Second, code size. On x86_64, loading 0, ~(0LLU) and 1LLU << 63 becomes:
31 ff xor %edi,%edi
48 c7 c7 ff ff ff ff mov $0xffffffffffffffff,%rdi
48 bf 00 00 00 00 00 00 00 80 movabs $0x8000000000000000,%rdi
Loading 0 is the fastest. Loading 1LLU << 63 needs the longest instruction encoding (10 bytes versus 7 for ~(0LLU)), and that alone has a performance cost. So using ~(0LLU) as the null member pointer has a slight performance advantage over 1LLU << 63.
It's similar on many architectures. On Mips64 the last needs a whole extra opcode: https://godbolt.org/z/3nehjcoM6
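To reproduce the comparison, three functions that each load one of the constants can be compiled with optimization and disassembled. This is only a sketch with made-up function names. Because the values are returned, the x86_64 instructions will target %rax instead of %rdi, and a compiler may pick a different but equivalent instruction depending on its flags.

```cpp
#include <cstdint>

// Each function just materializes one candidate constant so that the
// generated instruction can be inspected in the disassembly.
std::uint64_t load_zero()     { return 0; }
std::uint64_t load_all_ones() { return ~0ULL; }
std::uint64_t load_top_bit()  { return 1ULL << 63; }
```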
Third, convention. It's customary from the old C days that a function returns -1 or ~(0LLU) as an error code, except for pointers, where 0 is used. Member pointers can't use 0, because 0 is the valid offset of the first member.
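That 0 is not available as the null value can be shown directly: the first member of a struct sits at offset 0, and a pointer to it must still compare unequal to the null member pointer. This is a short sketch reusing the struct from the question. The helper names are made up for illustration.

```cpp
#include <cstddef>

struct X
{
    int x, y;
};

using member_ptr = int X::*;

// x is the first member, so its offset within X is 0.
static_assert(offsetof(X, x) == 0, "first member lives at offset 0");

// A pointer to a real member must compare unequal to the null member
// pointer, so the null value needs a representation other than offset 0.
bool first_member_is_not_null()
{
    member_ptr p = &X::x;
    return p != nullptr;
}

// The pointer to the first member is fully usable.
int read_first(const X& obj)
{
    member_ptr p = &X::x;
    return obj.*p;
}
```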
Personally I think the compiler developers were just following old habits (reason 3). That it's also faster is just luck (or those old C geezers knew what they were doing when choosing their error codes :).
As for why the compiler can't use ~(0LLU) when optimizing and 1LLU << 63 when debugging: you can compile some translation units as optimized code and some as debug code. They would then follow incompatible ABIs and couldn't be linked together.
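The linking problem can be simulated with plain integers. This sketch only models the situation and is not how any compiler is implemented: one hypothetical translation unit encodes null as ~(0LLU), while another tests for null using 1LLU << 63. All names are invented for the example.

```cpp
#include <cstdint>

// Null encodings of two hypothetical build modes.
constexpr std::uint64_t null_optimized = ~0ULL;
constexpr std::uint64_t null_debug     = 1ULL << 63;

// "Optimized" translation unit: produces a null member pointer.
std::uint64_t make_null_optimized() { return null_optimized; }

// "Debug" translation unit: tests for null using its own encoding.
bool is_null_debug(std::uint64_t member_ptr_bits)
{
    return member_ptr_bits == null_debug;
}

// Linking the two together: the null pointer is no longer recognized.
bool null_survives_linking()
{
    return is_null_debug(make_null_optimized());
}
```

Here null_survives_linking() returns false: the debug code would treat the optimized code's null as a valid member offset.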