
Can storing unrelated data in the least-significant-bit of a pointer work reliably?

Let me just say up front that I'm aware that what I'm about to propose is a mortal sin, and that I will probably burn in Programming Hell for even considering it.

That said, I'm still interested in knowing if there's any reason why this wouldn't work.

The situation is: I have a reference-counting smart-pointer class that I use everywhere. It currently looks something like this (note: incomplete/simplified pseudocode):

class IRefCountable
{
public:
    IRefCountable() : _refCount(0) {}
    virtual ~IRefCountable() {}

    void Ref() {_refCount++;}
    bool Unref() {return (--_refCount==0);}

private:
    unsigned int _refCount;
};

class Ref
{
public:
   Ref(IRefCountable * ptr, bool isObjectOnHeap) : _ptr(ptr), _isObjectOnHeap(isObjectOnHeap) 
   { 
      _ptr->Ref();
   }

   ~Ref() 
   {
      if ((_ptr->Unref())&&(_isObjectOnHeap)) delete _ptr;
   }

private:
   IRefCountable * _ptr;
   bool _isObjectOnHeap;
};

Today I noticed that sizeof(Ref)=16. However, if I remove the boolean member variable _isObjectOnHeap, sizeof(Ref) is reduced to 8. That means that for every Ref in my program, there are 7.875 wasted bytes of RAM... and there are many, many Refs in my program.

Well, that seems like a waste of some RAM. But I really need that extra bit of information (okay, humor me and assume for the sake of the discussion that I really do). And I notice that since IRefCountable is a non-POD class, it will (presumably) always be allocated on a word-aligned memory address. Therefore, the least significant bit of (_ptr) should always be zero.

Which makes me wonder... is there any reason why I can't OR my one bit of boolean data into the least-significant bit of the pointer, and thus reduce sizeof(Ref) by half without sacrificing any functionality? I'd have to be careful to AND out that bit before dereferencing the pointer, of course, which would make pointer dereferences less efficient, but that might be made up for by the fact that the Refs are now smaller, and thus more of them can fit into the processor's cache at once, and so on.

Is this a reasonable thing to do? Or am I setting myself up for a world of hurt? And if the latter, how exactly would that hurt be visited upon me? (Note that this is code that needs to run correctly in all reasonably modern desktop environments, but it doesn't need to run in embedded machines or supercomputers or anything exotic like that)

Jeremy Friesner asked Jun 13 '11 03:06





1 Answer

The problem here is that it is entirely machine-dependent. It isn't something one often sees in C or C++ code, but it has certainly been done many times in assembly. Old Lisp interpreters almost always used this trick to store type information in the low bit(s). (I have seen it in C code, but in projects that were being implemented for a specific target platform.)

Personally, if I were trying to write portable code, I probably wouldn't do this. That said, it will almost certainly work on "all reasonably modern desktop environments". (Certainly, it will work on every one I can think of.)
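For what it's worth, here is a rough sketch of how the packed version could look. PackedRef, kHeapBit, Get() and IsObjectOnHeap() are names I made up for illustration, and IRefCountable is copied from the question so the snippet stands alone. It assumes that converting a pointer to std::uintptr_t exposes the address with a zero low bit for any suitably aligned object, which holds on the usual desktop platforms but is implementation-defined. Copying is disabled only to keep the sketch short.

```cpp
#include <cassert>
#include <cstdint>

// Copied from the question so this sketch is self-contained.
class IRefCountable
{
public:
    IRefCountable() : _refCount(0) {}
    virtual ~IRefCountable() {}

    void Ref() { _refCount++; }
    bool Unref() { return (--_refCount == 0); }

private:
    unsigned int _refCount;
};

// Refuse to compile if the type could ever live at an odd address.
static_assert(alignof(IRefCountable) >= 2, "low pointer bit is not free");

// Hypothetical name: the bit that carries the isObjectOnHeap flag.
constexpr std::uintptr_t kHeapBit = 1;

class PackedRef
{
public:
    PackedRef(IRefCountable * ptr, bool isObjectOnHeap)
        : _bits(reinterpret_cast<std::uintptr_t>(ptr) | (isObjectOnHeap ? kHeapBit : 0))
    {
        // Assumption: the integer form of an aligned pointer has a zero low bit.
        assert((reinterpret_cast<std::uintptr_t>(ptr) & kHeapBit) == 0);
        Get()->Ref();
    }

    ~PackedRef()
    {
        IRefCountable * p = Get();
        if (p->Unref() && IsObjectOnHeap()) delete p;
    }

    // Copying would need extra Ref() calls; left out of this sketch.
    PackedRef(const PackedRef &) = delete;
    PackedRef & operator=(const PackedRef &) = delete;

    // Mask the flag off before the value is used as a pointer again.
    IRefCountable * Get() const
    {
        return reinterpret_cast<IRefCountable *>(_bits & ~kHeapBit);
    }

    bool IsObjectOnHeap() const { return (_bits & kHeapBit) != 0; }

private:
    std::uintptr_t _bits; // pointer and flag share one word
};
```

Every dereference has to go through Get(), so the raw word never escapes the class. Because the masked value is exactly the integer the original pointer converted to, the conversion back gives the original pointer; the only platform-specific part is the guarantee that the low bit was zero in the first place.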

A lot depends on the nature of your code. If you are the only one maintaining it, and nobody else will ever have to deal with the "world of hurt", then it might be OK. You will have to add #ifdefs for any odd architecture that you might need to support later on. On the other hand, if you are releasing it to the world as "portable" code, that would be cause for concern.

Another way to handle this is to write two versions of your smart pointer, one for machines on which this will work and one for machines where it won't. That way, as long as you maintain both versions, it won't be that big a deal to change a config file to use the 16-byte version.
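A sketch of the two-version idea might look like the following. REF_USE_TAGGED_POINTER is a hypothetical macro name that a config header would define only on platforms where the low bit is known to be free; Get(), IsObjectOnHeap() and Set() are likewise made-up helpers. The point is that both layouts present the same interface, so the rest of the program never sees which one is in use.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical build switch, normally set from a config header:
// #define REF_USE_TAGGED_POINTER 1

// Copied from the question so this sketch is self-contained.
class IRefCountable
{
public:
    IRefCountable() : _refCount(0) {}
    virtual ~IRefCountable() {}

    void Ref() { _refCount++; }
    bool Unref() { return (--_refCount == 0); }

private:
    unsigned int _refCount;
};

class Ref
{
public:
    Ref(IRefCountable * ptr, bool isObjectOnHeap)
    {
        Set(ptr, isObjectOnHeap);
        Get()->Ref();
    }

    ~Ref()
    {
        IRefCountable * p = Get();
        if (p->Unref() && IsObjectOnHeap()) delete p;
    }

    // Copying is left out to keep the sketch short.
    Ref(const Ref &) = delete;
    Ref & operator=(const Ref &) = delete;

#if defined(REF_USE_TAGGED_POINTER)
    // Compact layout: flag lives in the low bit of the pointer word.
    IRefCountable * Get() const
    {
        return reinterpret_cast<IRefCountable *>(_bits & ~std::uintptr_t(1));
    }
    bool IsObjectOnHeap() const { return (_bits & 1) != 0; }

private:
    void Set(IRefCountable * ptr, bool onHeap)
    {
        _bits = reinterpret_cast<std::uintptr_t>(ptr) | (onHeap ? 1u : 0u);
    }
    std::uintptr_t _bits;
#else
    // Portable layout: the original pointer plus a separate bool.
    IRefCountable * Get() const { return _ptr; }
    bool IsObjectOnHeap() const { return _isObjectOnHeap; }

private:
    void Set(IRefCountable * ptr, bool onHeap)
    {
        _ptr = ptr;
        _isObjectOnHeap = onHeap;
    }
    IRefCountable * _ptr;
    bool _isObjectOnHeap;
#endif
};
```

Switching versions is then a one-line change in the config, and the same tests can be built once with the macro defined and once without it.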

It goes without saying that you would have to avoid writing any other code that assumes sizeof(Ref) is 8 rather than 16. If you are using unit tests, run them with both versions.

andrewdski answered Sep 22 '22 17:09
