How portable is using the low bit of a pointer as a flag?

Tags: c++, optimization, pointers, portability

Suppose a class needs to store a pointer and a bool. For simplicity an int pointer is used in the examples, but the pointer type is irrelevant as long as it points to something whose sizeof is greater than 1.

Defining the class with { bool, int * } data members will typically result in a class that is twice the size of the pointer, with most of the extra space wasted on padding.
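To make the size claim concrete, here is a small sketch that lets the compiler compute the overhead instead of assuming it. The name wasted_bytes is introduced only for illustration; the exact value is implementation-specific (commonly 7 on 64-bit platforms, where the struct occupies 16 bytes).

```cpp
#include <cstddef>

// The straightforward layout: a pointer plus a bool.
struct myData
{
    int * ptr;
    bool flag;
};

// The bool needs at least one byte, so the struct is strictly larger than a pointer.
static_assert(sizeof(myData) > sizeof(int *), "bool adds at least one byte");

// The struct is at least as strictly aligned as its pointer member, and its
// size is a multiple of its alignment, so it is padded up accordingly.
static_assert(sizeof(myData) % alignof(int *) == 0, "size is padded to pointer alignment");

// Bytes that carry no information (padding), computed rather than assumed.
constexpr std::size_t wasted_bytes = sizeof(myData) - sizeof(int *) - sizeof(bool);
```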

If the pointer does not point to a char (or other data whose sizeof is 1), then presumably the low bit will always be zero. The class could be defined with { int * } or, for convenience, union { int *, uintptr_t }.

The bool is implemented by setting/clearing the low bit of the pointer as per the logical bool value and clearing the bit when you need to use the pointer.

The defined way:

struct myData
{
 int * ptr;
 bool flag;
};
myData x;

// initialize
x.ptr = new int;
x.flag = false;

// set flag true
x.flag = true;

// set flag false
x.flag = false;

// use ptr
*(x.ptr)=7;

// change ptr
x.ptr = y;            // y is another int *

And the proposed way:

union tiny
{
 int * ptr;
 uintptr_t flag;
};
tiny x;

// initialize
x.ptr = new int;

// set flag true
x.flag |= 1;

// set flag false
x.flag &= ~1;

// use ptr
tiny clean=x;      // note that clean will likely be optimized out
clean.flag &= ~1;  // back to original value as assigned to ptr
*(clean.ptr)=7;

// change ptr
bool flag=x.flag & 1;  // keep only the low bit; the whole value is nonzero for any non-null pointer
x.ptr = y;             // y is another int *
x.flag |= flag;

This seems to be undefined behavior, but how portable is this?

462

asked Nov 15 '13 00:11

Glenn Teitelbaum

3 Answers

As long as you restore the pointer's low-order bit before trying to use it as a pointer, it's likely to be "reasonably" portable, as long as your system, your C++ implementation, and your code meet certain assumptions.

I can't necessarily give you a complete list of assumptions, but off the top of my head:

It assumes you're not pointing to anything whose size is 1 byte. This excludes char, unsigned char, signed char, int8_t, and uint8_t. (And that assumes CHAR_BIT == 8; on exotic systems with, say, 16-bit or 32-bit bytes, other types might be excluded.)
It assumes objects whose size is at least 2 bytes are always aligned at an even address. Note that x86 doesn't require this; you can access a 4-byte int at an odd address, but it will be slightly slower. But compilers typically arrange for objects to be stored at even addresses. Other architectures may have different requirements.
It assumes a pointer to an even address has its low-order bit set to 0.

For that last assumption, I actually have a concrete counterexample. On Cray vector systems (J90, T90, and SV1 are the ones I've used myself) a machine address points to a 64-bit word, but the C compiler under Unicos sets CHAR_BIT == 8. Byte pointers are implemented in software, with the 3-bit byte offset within a word stored in the otherwise unused high-order 3 bits of the 64-bit pointer. So a pointer to an 8-byte aligned object could easily have its low-order bit set to 1.
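Some of these assumptions can be stated in code rather than left implicit. The sketch below is only an illustration: the alignment requirement can be checked at compile time, while the bit-level representation of an address depends on the implementation-defined pointer-to-integer conversion and can only be observed at run time. The names low_bit_is_clear and check_sample are made up for this example, and std::uintptr_t is assumed to be available.

```cpp
#include <cassert>
#include <cstdint>

// Size and alignment assumptions: if alignof(int) >= 2, every properly
// aligned int lives at an even address.
static_assert(alignof(int) >= 2, "int must be at least 2-byte aligned to free the low bit");

// Representation assumption: an even address converts to an integer whose
// low-order bit is 0. This mapping is implementation-defined.
inline bool low_bit_is_clear(const int * p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 1u) == 0;
}

// Example use: verify the assumption for one sample object.
inline void check_sample()
{
    int sample = 0;
    assert(low_bit_is_clear(&sample));
}
```

On a system like the Cray described above, the static_assert could pass while the run-time check fails, which is why both are worth having.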

There have been Lisp implementations (example) that use the low-order 2 bits of pointers to store a type tag. I vaguely recall this causing serious problems during porting.

Bottom line: You can probably get away with it for most systems. Future architectures are largely unpredictable, and I can easily imagine your scheme breaking on the next Big New Thing.

Some things to consider:

Can you store the boolean values in a bit vector outside your class? (Maintaining the association between your pointer and the corresponding bit in the bit vector is left as an exercise).
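One possible shape for that idea, as a rough sketch only: keep the pointers untouched and put the flags in a parallel std::vector<bool>, tied together by index. PointerTable and its member names are hypothetical, and whether std::vector<bool> really packs one bit per element is up to the implementation (it typically does).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container: pointers are stored unmodified, flags live in a
// separate bit vector, and an index associates the two.
class PointerTable
{
public:
    // Adds a pointer with its flag and returns the index shared by both.
    std::size_t add(int * p, bool flag)
    {
        ptrs_.push_back(p);
        flags_.push_back(flag);
        return ptrs_.size() - 1;
    }

    int * ptr(std::size_t i) const
    {
        assert(i < ptrs_.size());
        return ptrs_[i];
    }

    bool flag(std::size_t i) const
    {
        assert(i < flags_.size());
        return flags_[i];
    }

    void set_flag(std::size_t i, bool value)
    {
        assert(i < flags_.size());
        flags_[i] = value;
    }

private:
    std::vector<int *> ptrs_;
    std::vector<bool> flags_;  // typically stored as packed bits
};
```

This avoids any dependence on pointer representation, at the cost of an extra indirection and the bookkeeping needed to keep indices stable.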

Consider adding code to all pointer operations that fails with an error message if it ever sees a pointer with its low-order bit set to 1. Use #ifdef to remove the checking code in your production version. If you start running into problems on some platform, build a version of your code with the checks enabled and see what happens.
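A minimal sketch of such a check, assuming a hypothetical macro name POINTER_CHECKS and a hypothetical helper named checked. It is meant for raw pointers before a tag is added or after it has been stripped, and it compiles down to a plain pass-through when the macro is not defined.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// POINTER_CHECKS is a made-up macro name: define it (for example with
// -DPOINTER_CHECKS) in diagnostic builds and leave it undefined in production.
inline int * checked(int * p)
{
#ifdef POINTER_CHECKS
    if (reinterpret_cast<std::uintptr_t>(p) & 1u)
    {
        std::fprintf(stderr, "pointer %p has its low-order bit set\n",
                     static_cast<void *>(p));
        std::abort();
    }
#endif
    return p;
}
```

A call site would then read something like `*checked(p) = 7;`, so the check sits exactly where the pointer is used.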

I suspect that, as your application grows (they seldom shrink), you'll want to store more than just a bool along with your pointer. If that happens, the space issue goes away, because you're already using that extra space anyway.

189

answered Sep 24 '22 17:09

Keith Thompson

In "theory": it's undefined behavior as far as I know.

In "reality": it'll work on everyday x86/x64 machines, and probably ARM too?
I can't really make a statement beyond that.

answered Sep 25 '22 17:09

user541686

It's very portable, and furthermore, you can assert when you accept the raw pointer to make sure it meets the alignment requirement. This will insure against the unfathomable future compiler that somehow messes you up.

The only reasons not to do it are the readability cost and general maintenance burden associated with "hacky" stuff like that. I'd shy away from it unless there's a clear gain to be made. But it is sometimes totally worth it.
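As a rough sketch of what asserting on the incoming raw pointer could look like, here is a small wrapper. TaggedIntPtr and its member names are invented for this example and are not from the question. It uses reinterpret_cast to std::uintptr_t instead of the union, since reading a union member other than the one last written is itself undefined in C++, whereas the pointer-to-integer conversion is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical wrapper: stores an int* and a bool in one pointer-sized integer.
class TaggedIntPtr
{
public:
    explicit TaggedIntPtr(int * p = nullptr, bool flag = false) { set(p, flag); }

    // Accepts a raw pointer and asserts that its low bit is free for the flag.
    void set(int * p, bool flag)
    {
        const std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(p);
        assert((raw & 1u) == 0 && "pointer is not suitably aligned for tagging");
        bits_ = raw | static_cast<std::uintptr_t>(flag);
    }

    void set_ptr(int * p) { set(p, flag()); }        // keeps the current flag
    void set_flag(bool flag) { set(ptr(), flag); }   // keeps the current pointer

    bool flag() const { return (bits_ & 1u) != 0; }

    // Strips the tag bit before converting back to a usable pointer.
    int * ptr() const
    {
        return reinterpret_cast<int *>(bits_ & ~static_cast<std::uintptr_t>(1));
    }

private:
    std::uintptr_t bits_;
};
```

Keeping the bit manipulation inside one small class also limits the readability cost, because the rest of the code only sees ptr() and flag().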

answered Sep 23 '22 17:09

VoidStar
