Portable tagged pointers

Tags: c++, c, pointers

Is there a portable way to implement a tagged pointer in C/C++, such as documented macros that work across platforms and compilers? Or are you on your own when you tag your pointers? If such helper functions/macros exist, are they part of any standard, or are they only available as open source libraries?

For those who do not know what a tagged pointer is but are interested: it is a way to store some extra data inside a normal pointer. On most architectures some bits in pointers are always 0 or 1, so you keep your flags/types/hints in those extra bits and clear them right before you dereference the pointer.

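#include <stdint.h>

const uintptr_t gc_flag = 1;
const uintptr_t flag_mask = 7; // aka 0b111, because on some theoretical CPU under some arbitrary OS compiled with some random compiler and using some particular malloc last three bits are always zero in pointers.

struct value {
   void *data;
};

int data = 42; // assumed to sit at an address whose last three bits are zero
struct value val;
// bitwise operators do not apply to pointers, so go through uintptr_t
val.data = (void *)((uintptr_t)&data | gc_flag);
// clear the tag bits (note the ~) before dereferencing
int copy = *(int *)((uintptr_t)val.data & ~flag_mask);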

https://en.wikipedia.org/wiki/Pointer_tagging

exebook asked Nov 26 '17 21:11


2 Answers

You can reserve the lowest N bits of an address for your own use by guaranteeing that the objects are aligned to multiples of 1 << N. This can be achieved platform-independently in different ways (alignas and aligned_storage for stack-based objects, or std::aligned_alloc, available since C++17, for dynamic objects), depending on what you want to achieve:

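#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <new>

struct Data { ... };

alignas(1 << 4) Data d; // 4 bits, 16-byte alignment
assert(reinterpret_cast<std::uintptr_t>(&d) % 16 == 0);

// dynamic (preferably with a unique_ptr or alike);
// std::aligned_alloc requires the size to be a multiple of the alignment
constexpr std::size_t align = 1 << 4;
void* ptr = std::aligned_alloc(align, (sizeof(Data) + align - 1) / align * align);
auto obj = new (ptr) Data;
...
obj->~Data();
std::free(ptr);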

You pay for this with wasted memory: the padding grows exponentially with the number of tag bits required. Also, if you plan to allocate many such objects contiguously, fewer of them fit into each cache line, so even comparatively small arrays can slow the program down considerably. This solution therefore does not scale well.
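To make the trade-off concrete, here is a minimal sketch that puts the alignment on the type itself and checks both the number of free low bits and the padding cost at compile time. The names Node, free_low_bits and low_bits_clear are made up for illustration, and the sizeof check assumes an int is no larger than 16 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical payload type: alignas(16) guarantees that every valid
// Node address has its 4 lowest bits set to zero.
struct alignas(16) Node {
    int value;
};

// Number of low address bits that are always zero for a power-of-two alignment.
constexpr unsigned free_low_bits(std::size_t alignment) {
    unsigned bits = 0;
    while (alignment > 1) {
        alignment >>= 1;
        ++bits;
    }
    return bits;
}

static_assert(free_low_bits(alignof(Node)) == 4, "16-byte alignment frees 4 bits");

// The price: sizeof is padded up to a multiple of the alignment, so a Node
// holding a single int occupies 16 bytes (assuming sizeof(int) <= 16).
static_assert(sizeof(Node) == 16, "padding grows with the alignment");

// Runtime check that a given Node address really has its tag bits clear.
bool low_bits_clear(const Node* p) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignof(Node) - 1)) == 0;
}
```

Each extra tag bit doubles the alignment, which is where the exponential growth in padding comes from.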

Jodocus answered Sep 26 '22 00:09


If you're sure that the addresses you are passing around always have certain bits unused, then you could use uintptr_t as a transport type. This is an integer type that maps to pointers in the expected way (and will fail to exist on an obscure platform that offers no such possible map).

There aren't any standard macros but you can roll your own easily enough. The code (sans macros) might look like:

void T_func(uintptr_t t)
{
    uint8_t tag = (t & 7);
    T *ptr = (T *)(t & ~(uintptr_t)7);

    // ...
}

int main()
{
    T *ptr = new T;
    assert( ((uintptr_t)ptr % 8) == 0 );
    T_func( (uintptr_t)ptr + 3 );
}

This may defeat compiler optimizations that involve tracking pointer usage.
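If you would rather not repeat the masks by hand, the same idea can be wrapped in a small class. This is only a sketch: TaggedPtr, kTagMask and Item are hypothetical names, not part of any standard or library, and it assumes a platform where uintptr_t exists and where clearing the low bits and converting back yields the original pointer (true on mainstream platforms, but not guaranteed by the standard).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: stores a small tag in the low bits of a T* that
// alignof(T) guarantees to be zero.
template <typename T>
class TaggedPtr {
    // alignof(T) is a power of two, so alignof(T) - 1 masks the free low bits.
    static constexpr std::uintptr_t kTagMask = alignof(T) - 1;
    std::uintptr_t bits_;

public:
    TaggedPtr(T* p, unsigned tag)
        : bits_(reinterpret_cast<std::uintptr_t>(p) | tag) {
        // The pointer must really be aligned, and the tag must fit.
        assert((reinterpret_cast<std::uintptr_t>(p) & kTagMask) == 0);
        assert(tag <= kTagMask);
    }

    // Clear the tag bits before handing the pointer back.
    T* get() const { return reinterpret_cast<T*>(bits_ & ~kTagMask); }

    // Extract only the tag bits.
    unsigned tag() const { return static_cast<unsigned>(bits_ & kTagMask); }
};

// Hypothetical payload with 8-byte alignment, i.e. 3 free tag bits.
struct alignas(8) Item {
    int v;
};
```

With this, TaggedPtr<Item> tp(&item, 3) followed by tp.get() and tp.tag() mirrors the T_func example above, with the mask derived from alignof instead of a hard-coded 7.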

M.M answered Sep 24 '22 00:09