
Emulating a packed structure in portable C

I have the following structure:

typedef struct Octree {
    uint64_t *data;
    uint8_t alignas(8) alloc;
    uint8_t dataalloc;
    uint16_t size, datasize, node0;
    // Node8 is a union type of size 16, omitted for brevity
    Node8 alignas(16) node[]; 
} Octree;

In order for the code that operates on this structure to work as intended, node0 must immediately precede the first node, so that ((uint16_t *)Octree.node)[-1] accesses Octree.node0. Each Node8 is essentially a union holding 8 uint16_t. With GCC I could force the structure to be packed using #pragma pack(push) and #pragma pack(pop). However, this is non-portable. Another option is to:

  • Assume sizeof(uint64_t *) <= sizeof(uint64_t)
  • Store the structure as just 2 uint64_t followed immediately by the node data, and the members are accessed manually via bitwise arithmetic and pointer casts

This option is quite impractical. How else could I define this 'packed' data structure in a portable way? Are there any other ways?

CPlus asked Dec 21 '25 09:12


2 Answers

The C language standard does not allow you to specify a struct's memory layout down to the last bit. Other languages do (Ada and Erlang come to mind), but C does not.

So if you want actual portable standard C, you specify a C struct for your data, and convert from and to specific memory layout using pointers, probably composing from and decomposing into a lot of uint8_t values to avoid endianness issues. Writing such code is error prone, requires duplicating memory, and depending on your use case, it can be relatively expensive in both memory and processing.
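
As a rough illustration of that byte-by-byte conversion, here is a minimal sketch. It assumes the external layout is the 8 header bytes from the question (alloc, dataalloc, size, datasize, node0) stored little-endian; the names OctreeHeader, octree_header_pack and octree_header_unpack are made up for this example and do not come from the question.

```c
#include <stdint.h>

/* Hypothetical in-memory form: the compiler may lay this out however it likes. */
typedef struct OctreeHeader {
    uint8_t  alloc, dataalloc;
    uint16_t size, datasize, node0;
} OctreeHeader;

/* Size of the assumed external layout: 1 + 1 + 2 + 2 + 2 bytes. */
enum { OCTREE_HEADER_WIRE_SIZE = 8 };

/* Store a 16-bit value as two bytes, least significant byte first. */
static void put_u16le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)(v >> 8);
}

/* Rebuild a 16-bit value from two bytes, least significant byte first. */
static uint16_t get_u16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((unsigned)p[1] << 8));
}

/* Struct -> fixed byte layout, independent of padding and host endianness. */
void octree_header_pack(uint8_t out[OCTREE_HEADER_WIRE_SIZE], const OctreeHeader *h)
{
    out[0] = h->alloc;
    out[1] = h->dataalloc;
    put_u16le(out + 2, h->size);
    put_u16le(out + 4, h->datasize);
    put_u16le(out + 6, h->node0);
}

/* Fixed byte layout -> struct. */
void octree_header_unpack(OctreeHeader *h, const uint8_t in[OCTREE_HEADER_WIRE_SIZE])
{
    h->alloc     = in[0];
    h->dataalloc = in[1];
    h->size      = get_u16le(in + 2);
    h->datasize  = get_u16le(in + 4);
    h->node0     = get_u16le(in + 6);
}
```

Every access goes through a copy, which is the memory and processing cost mentioned above.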

If you want direct access to a memory layout via a struct in C, you need to rely on compiler features which are not in the C language specification, and therefore are not "portable C".

So the next best thing is to make your C code as portable as possible while at the same time preventing compilation of that code for incompatible platforms. You define the struct and provide platform/compiler specific code for each and every supported combination of platform and compiler, and the code using the struct can be the same on every platform/compiler.

Now you need to make sure that it is impossible to accidentally compile for a platform/compiler where the memory layout is not exactly the one your code and your external interface require.

Since C11, that is possible using static_assert, sizeof and offsetof.

So something like the following should do the job if you can require C11 (I presume you can require C11 as you are using alignas which is not part of C99 but is part of C11). The "portable C" part here is you fixing the code for each platform/compiler where the compilation fails due to one of the static_assert declarations failing.

#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef uint16_t Node8[8];

typedef struct Octree {
    uint64_t *data;
    uint8_t alignas(8) alloc;
    uint8_t dataalloc;
    uint16_t size, datasize, node0;
    Node8 alignas(16) node[];
} Octree;

static_assert(0x10 == sizeof(Octree),              "Octree size error");
static_assert(0x00 == offsetof(Octree, data),      "Octree data position error");
static_assert(0x08 == offsetof(Octree, alloc),     "Octree alloc position error");
static_assert(0x09 == offsetof(Octree, dataalloc), "Octree dataalloc position error");
static_assert(0x0a == offsetof(Octree, size),      "Octree size position error");
static_assert(0x0c == offsetof(Octree, datasize),  "Octree datasize position error");
static_assert(0x0e == offsetof(Octree, node0),     "Octree node0 position error");
static_assert(0x10 == offsetof(Octree, node),      "Octree node[] position error");

The series of static_assert declarations could be written more concisely, with less repetition in the error messages, by using a preprocessor macro that stringifies the struct name, the member name, and maybe the size/offset value.
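
One possible shape for such a macro is sketched below. CHECK_SIZE and CHECK_OFFSET are hypothetical names invented for this example, Node8 is a stand-in typedef, and the expected values assume the same layout as the asserts above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t Node8[8];  /* stand-in for the real 16-byte union */

typedef struct Octree {
    uint64_t *data;
    alignas(8) uint8_t alloc;
    uint8_t dataalloc;
    uint16_t size, datasize, node0;
    alignas(16) Node8 node[];
} Octree;

/* Hypothetical helpers: the message is built by stringifying the arguments. */
#define CHECK_SIZE(type, expected) \
    static_assert(sizeof(type) == (expected), \
                  "sizeof(" #type ") != " #expected)

#define CHECK_OFFSET(type, member, expected) \
    static_assert(offsetof(type, member) == (expected), \
                  "offsetof(" #type ", " #member ") != " #expected)

CHECK_SIZE(Octree, 0x10);
CHECK_OFFSET(Octree, data,      0x00);
CHECK_OFFSET(Octree, alloc,     0x08);
CHECK_OFFSET(Octree, dataalloc, 0x09);
CHECK_OFFSET(Octree, size,      0x0a);
CHECK_OFFSET(Octree, datasize,  0x0c);
CHECK_OFFSET(Octree, node0,     0x0e);
CHECK_OFFSET(Octree, node,      0x10);
```

A failing check then reports something like "offsetof(Octree, node0) != 0x0e" without the message having been typed by hand.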

Now that we have nailed down the struct member sizes and offsets within the struct, two aspects still need checks.

  • The integer endianness your code expects is the same endianness your memory structure contains. If the endianness happens to be "native", you have nothing to check for or to handle conversions. If the endianness is "big endian" or "little endian", you need to add some checks and/or do conversions.

  • As noted in the comments to the question, you will need to verify separately that the undefined behaviour &(((uint16_t *)octree.node)[-1]) == &octree.node0 actually is what you expect it to be on this compiler/platform.

    Ideally, you would find a way to write this as a separate static_assert declaration. However, such a test is quick and short enough that you can add it to the runtime code in a function that runs rarely but is guaranteed to run, such as a global initialization function, a library initialization function, or maybe even a constructor. Be cautious if you use the assert() macro for that check, though, as it turns into a no-op if the NDEBUG macro is defined.
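
The two runtime checks described above might look roughly like this. It is only a sketch: the function names are invented for this example, Node8 is a stand-in typedef, and calloc is assumed to return memory aligned well enough for the 16-byte-aligned struct (true on common 64-bit platforms). The checks use plain if/return logic rather than assert(), so they survive NDEBUG. Note that the second check exercises the very undefined behaviour it probes, so a passing result only tells you what this compiler and these flags happen to do.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint16_t Node8[8];  /* stand-in for the real 16-byte union */

typedef struct Octree {
    uint64_t *data;
    alignas(8) uint8_t alloc;
    uint8_t dataalloc;
    uint16_t size, datasize, node0;
    alignas(16) Node8 node[];
} Octree;

/* Returns 1 if the host stores the low byte of an integer first, else 0. */
int host_is_little_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* inspect the first byte in memory */
    return first == 0x02;
}

/* Returns 1 if ((uint16_t *)node)[-1] reads back node0,
 * 0 if it does not, and -1 if the test object could not be allocated. */
int octree_node0_alias_ok(void)
{
    /* Room for the header plus one node. */
    Octree *o = calloc(1, sizeof *o + sizeof(Node8));
    int ok;

    if (o == NULL)
        return -1;
    o->node0 = 0xBEEF;                           /* sentinel value */
    ok = ((uint16_t *)o->node)[-1] == 0xBEEF;    /* the access under test */
    free(o);
    return ok;
}
```

A library initialization function could call both and refuse to continue if either result is not the one the rest of the code expects.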

ndim answered Dec 23 '25 21:12


TL;DR: get rid of the alignas, nest your structs as deeply as needed, and reorder/reorganize the layout to compact the memory on commonly seen "typical" architectures. It is not unusual for me to have 6 or 7 layers mixing structs and unions in order to get the layout to fall into place just right on common architectures, while remaining portable to rare architectures. This is common practice, it doesn’t hurt performance, and it doesn’t decrease readability. Also:

  1. I avoid alignas and opt for padding members, which are more flexible and often self-documenting.
  2. Zero-length (flexible) array members can be useful on occasion but are often overused. Declaring the array with a length of one and subtracting the size of one element when computing the allocation size ensures portability and avoids alignment behavior that differs between some compilers.

typedef struct Octree_payload {
    // Members are laid out in declaration order;
    //   typical ABIs insert no padding between them.
    uint8_t alloc, dataalloc;
    uint16_t size, datasize, node0;
} Octree_payload_t;

typedef union Octree_union {
    // Only access payload at index -1
    Octree_payload_t _pl[1];
    // Node8 is a union type of size 16
    Node8 node[1];
} Octree_union_u;

// Usage ex.: OCTREE_PAYLOAD(x).datasize
#define OCTREE_PAYLOAD(o) ((o).octu._pl[-1])
typedef struct Octree {
    uint64_t *data;
    Octree_payload_t _payload_pad; //do not access!
    Octree_union_u octu;
} Octree_t;

Observe: the above setup achieves what you asked for portably and elegantly. The _payload_pad member pushes octu at least sizeof(Octree_payload_t) bytes (8 on typical platforms) past data, plus any extra space needed for proper field alignment on the architecture. We then access the payload fields at index -1 relative to the first Node8 via OCTREE_PAYLOAD, which is safe and portable in practice. (Yes, strictly speaking this is undefined behavior; however, in practice it is quite well defined for all real-world compilers on all architectures; see the next paragraph. The reason it is well defined is that compilers reconcile the alignment of both array members of Octree_union_u when calculating the offset of octu in Octree_t, meaning it is safe to index both array members, _pl and node, including at negative indexes.)
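
The layout assumptions in this paragraph can also be checked at compile time rather than taken on trust. The sketch below repeats the definitions with a stand-in Node8 (assumed here to be a union wrapping eight uint16_t) and adds C11 static_assert declarations. The asserts are an addition for illustration and only confirm the offsets; they say nothing about whether the negative index itself is well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for the real Node8 union. */
typedef union Node8 { uint16_t child[8]; } Node8;

typedef struct Octree_payload {
    uint8_t alloc, dataalloc;
    uint16_t size, datasize, node0;
} Octree_payload_t;

typedef union Octree_union {
    Octree_payload_t _pl[1];   /* only ever indexed at -1 */
    Node8 node[1];
} Octree_union_u;

typedef struct Octree {
    uint64_t *data;
    Octree_payload_t _payload_pad;  /* reserves the space that _pl[-1] lands on */
    Octree_union_u octu;
} Octree_t;

/* The payload must contain no padding ... */
static_assert(sizeof(Octree_payload_t) == 8,
              "Octree_payload_t contains padding");
/* ... node0 must be its last two bytes ... */
static_assert(offsetof(Octree_payload_t, node0) + sizeof(uint16_t)
                  == sizeof(Octree_payload_t),
              "node0 is not the final member of the payload");
/* ... and octu must start exactly where _payload_pad ends. */
static_assert(offsetof(Octree_t, octu)
                  == offsetof(Octree_t, _payload_pad) + sizeof(Octree_payload_t),
              "padding between _payload_pad and octu");
static_assert(sizeof(Node8) == 16, "Node8 is not 16 bytes");
```

If any of these fails on some target, the build stops there instead of silently reading the wrong bytes.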

What about the undefined behavior of unions? What about the C standard permitting pointer sizes that are not a power of two, or bytes that are not 8 bits? It is not practical to write any non-trivial C program that accounts for every possible environment and interpretation of the C standard. In the case of unions, undefined behavior means in practice that the underlying representation may be exposed (and casting it can give unexpected results); it does not mean the program does something unpredictable like segfault. (Notice: this applies to practical real-world compilers on non-ancient architectures; daydreaming about hypotheticals won’t make them real.) For example, if you compile with pointer tagging on a supporting ARM64 CPU, casting a pointer to uintptr_t will strip the tag and give you the expected address, whereas reading it through a union may or may not strip the tag (depending on compiler optimizations and heuristics), giving you an invalid address that would segfault if you cast it back to a pointer and accessed it.

What about reducing memory usage? No need. The code I provided compiles to an optimal layout without any struct packing on common 32-bit and 64-bit CPUs. (In fact, struct packing is rarely needed if you organize your structs right.) What of strange or esoteric CPUs? First, let me ask you: do you think the users of your software would like it if the software ran significantly slower just to save a little RAM? I guarantee the answer is, “Of course not!” In general, many strange or esoteric architectures handle underaligned data extremely poorly and slowly. Accept that there is nothing you can do to remedy the situation, and don’t sweat it.

Do not use CPlus’s solution of manually unraveling the struct into a flat list of fixed-width types. This may seem easier conceptually, but it quickly snowballs into an unwieldy maze of #if/#else macros when you try to mix in pointers and non-fixed-width types. Trust me, it pays off to take the time to figure out the best way to nest structs and unions deeply to achieve an optimal layout portably. It also has the advantage of playing well with compiler tooling, so the compiler can warn you about questionable lines of code.

Important side note about packed structs

Beware of unaligned access when indexing arrays inside a packed struct! Although it is not a problem in your sample code, as uint16_t only needs 2-byte alignment, passing around a pointer to an element of a member array in a packed struct can cause unaligned access, which requires special instructions on some architectures (otherwise your code will crash). For example:

#include <stddef.h>
#include <stdint.h>
#ifdef _MSC_VER
#  pragma pack(push, 1)
#endif
struct buggy {
    size_t len;
    uint64_t arr[1];
}
#ifdef _MSC_VER
#  pragma pack(pop)
#elif defined(__GNUC__)
__attribute__((packed))
#endif
;
//BAD CODE!: returns an unaligned pointer on 32-bit
//  arches that support 64-bit aligned loads
uint64_t *buggy_nth(struct buggy *st, size_t idx) {
    return idx < st->len ? &(st->arr[idx]) : NULL;
} 

Sure enough, GCC warns about the issue. (Not very intelligently, though: GCC seems to issue a blanket warning for all packed struct array addressing, even in various obviously safe cases.)

<source>:19:28: warning: taking address of packed member of 'struct buggy' may result in an unaligned pointer value [-Waddress-of-packed-member]
   19 |     return idx < st->len ? &(st->arr[idx]) : NULL;
      |                            ^~~~~~~~~~~~~~~

Notice: you must not rely on the compiler or tooling to warn you. Although it did in this trivial case, it is quite easy to get into a situation where you are casting to and from void * and accidentally suppress or obfuscate the compiler’s warnings.
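
One way to sidestep the problem is to never hand out a pointer into the packed struct at all, and to copy the element out instead. The following is a minimal sketch of that idea; struct packed_buf and packed_buf_get are hypothetical names for this example, and the packing directives mirror the ones used above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef _MSC_VER
#  pragma pack(push, 1)
#endif
struct packed_buf {
    size_t len;
    uint64_t arr[1];
}
#ifdef _MSC_VER
#  pragma pack(pop)
#elif defined(__GNUC__)
__attribute__((packed))
#endif
;

/* Copies element idx into *out and returns true,
 * or returns false if idx is out of range.
 * memcpy works on bytes, so it does not care whether the
 * source address is suitably aligned for uint64_t. */
bool packed_buf_get(const struct packed_buf *st, size_t idx, uint64_t *out)
{
    const unsigned char *base = (const unsigned char *)st;

    if (idx >= st->len)
        return false;
    memcpy(out,
           base + offsetof(struct packed_buf, arr) + idx * sizeof *out,
           sizeof *out);
    return true;
}
```

The caller receives an ordinary, properly aligned uint64_t, at the cost of one small copy per access.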

Jack G answered Dec 23 '25 21:12


