
Storing data in the most significant bits of a pointer

Tags: c++, c, pointers

I've found quite a bit of information about tagged pointers, which (ab)use the alignment requirements of types to store bits of data in their unused least significant bits.
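For reference, the low-bit variant can be sketched roughly like this. It is a minimal illustration, not a library API: the names `tag_low`, `low_tag`, `untag_low` and `kLowTagMask` are made up for this example, and it assumes that converting a pointer to `std::uintptr_t` exposes the address bits directly, which is implementation-defined but holds on the common platforms.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only.
// Alignment guarantees that the low bits of a valid pointer are zero;
// for std::uint64_t that is 3 bits (mask 7) on common 64-bit ABIs.
constexpr std::uintptr_t kLowTagMask = alignof(std::uint64_t) - 1;

// Pack a small tag into the free low bits of an aligned pointer.
inline std::uintptr_t tag_low(std::uint64_t* p, unsigned tag) {
    auto bits = reinterpret_cast<std::uintptr_t>(p);
    assert((bits & kLowTagMask) == 0);  // pointer must be properly aligned
    assert(tag <= kLowTagMask);         // tag must fit in the free bits
    return bits | tag;
}

// Read the tag back.
inline unsigned low_tag(std::uintptr_t tagged) {
    return static_cast<unsigned>(tagged & kLowTagMask);
}

// Clear the tag bits to recover the original pointer.
inline std::uint64_t* untag_low(std::uintptr_t tagged) {
    return reinterpret_cast<std::uint64_t*>(tagged & ~kLowTagMask);
}
```

The tagged value is kept as an integer rather than a pointer, so nothing can dereference it by accident before the tag has been cleared.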

I was wondering, however: couldn't you do the same with the most significant bits on a 64-bit system? Even if you were to use the 16 most significant bits of a 64-bit pointer, the remaining 48 bits still address 256 TiB, so you would need more than 256 terabytes of RAM for the tag bits and the address bits to overlap.

I know that in theory this is undefined behavior, but how would this behave in practice on some of the common operating systems (Windows/Mac/Linux)?

And yes, I am aware that this is evil and dangerous, but that is not what this question is about. It is a "what if" question about pushing computer programs to their limits, not one about sane and portable software design.

Rick de Water asked Jan 07 '23 02:01



1 Answer

If you know your exact memory layout you can probably do it, but it's risky. The most common 64-bit systems running Windows/Mac/Linux are amd64. There the machine only has 48-bit virtual addresses (for the foreseeable future), so in theory you have 16 bits to play around in, plus the low bits freed up by alignment.

Except. Half of the address space is negative; addresses range over [-2^47, 2^47), because the hardware requires the upper 16 bits to be copies of bit 47. So you can't be sure whether the bits set in the pointer actually mean that your magic bits are set or that you just have a negative address.
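To make the "negative address" point concrete, here is a small sketch of the canonical-address rule for 48-bit virtual addresses. The helper names `is_canonical48` and `sign_extend48` are invented for this example, and the code works on plain 64-bit integers rather than real pointers, so it only illustrates the bit pattern.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only.

// With 48-bit virtual addresses, an address is canonical when
// bits 63..47 (17 bits) are all zero or all one.
inline bool is_canonical48(std::uint64_t addr) {
    std::uint64_t top = addr >> 47;  // keep only bits 63..47
    return top == 0 || top == 0x1FFFF;
}

// Drop whatever is stored in the upper 16 bits and rebuild the
// canonical form by copying bit 47 into bits 63..48.
inline std::uint64_t sign_extend48(std::uint64_t addr) {
    std::uint64_t low = addr & 0x0000FFFFFFFFFFFFull;
    return (low & (1ull << 47)) ? (low | 0xFFFF000000000000ull) : low;
}
```

Note that `sign_extend48` cannot tell a tag from a sign: it recovers the right address for either half only because bit 47 is left untouched by the tag.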

Except. Today, most, if not all, operating systems put the kernel in the negative address space and put the userland in the positive address space. It makes certain things easier and faster to manage. So you could abuse that knowledge to assume that playing with those bits should be safe.

Except. I've never seen a guarantee from any operating system that this situation will remain forever (doesn't mean that one doesn't exist, I just haven't seen one). You might update your kernel one day and suddenly the operating system decided that userland is negative and kernel is positive, or userland gets more address space to play around in.

As long as you mask out the extra bits before you dereference your pointers, you will be safe today, but maybe not tomorrow. And when you build your code around an assumption like this, you deserve all the pain you get when the undefined behavior you get away with becomes undefined behavior you don't get away with. Painting yourself into a corner like this is not fun.
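A minimal sketch of "mask before you dereference" might look like the following. Everything here is an assumption for illustration: the class name `HighTaggedPtr` is made up, and the code only works while pointers are 64 bits wide, user-space addresses are positive, and they fit in 48 bits. It asserts that last assumption instead of silently corrupting a pointer.

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(void*) == 8, "sketch assumes 64-bit pointers");

constexpr int kTagShift = 48;  // assumed address width, not guaranteed
constexpr std::uint64_t kAddrMask = (std::uint64_t{1} << kTagShift) - 1;

// Hypothetical wrapper: keeps a 16-bit tag in bits 63..48 of a pointer.
template <typename T>
class HighTaggedPtr {
public:
    HighTaggedPtr(T* p, std::uint16_t tag) {
        auto bits = reinterpret_cast<std::uintptr_t>(p);
        // Only valid for positive (lower-half) addresses below 2^48.
        assert((bits & ~kAddrMask) == 0);
        bits_ = bits | (static_cast<std::uint64_t>(tag) << kTagShift);
    }

    std::uint16_t tag() const {
        return static_cast<std::uint16_t>(bits_ >> kTagShift);
    }

    // Mask the tag out before the value is ever used as a pointer.
    T* get() const {
        return reinterpret_cast<T*>(
            static_cast<std::uintptr_t>(bits_ & kAddrMask));
    }

    T& operator*() const { return *get(); }

private:
    std::uint64_t bits_;  // stored as an integer, never as a raw pointer
};
```

Something like `HighTaggedPtr<int> p(&value, 0xBEEF);` then gives back the tag through `p.tag()` and the object through `*p`. The assert in the constructor is the part that starts failing the day userland addresses stop fitting in the lower 48 bits.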

Art answered Jan 17 '23 22:01
