Does C strict aliasing make untyped static memory pools impossible?

WG14 member Jens Gustedt says in a blog post on strict aliasing rules:

Character arrays must not be reinterpreted as objects of other types.

Is that, in fact, true? (I guess the corresponding language in the standard is the part saying that if an object has a declared type, then that type is its effective type.) If so, does it mean that an allocator that parcels out memory from a statically declared memory region is unimplementable in standard C?

I know TeX ignores most of Pascal’s type system and treats everything as an array of words because of a similar issue, but I hoped that if I ever found myself in a similar situation in (malloc-less) C, I could just declare a maximally aligned char array and keep using structs the usual way. I also fail to see what the point of _Alignas could possibly be in such a world, except as a standardized device for expressing non-standard requirements (similar to volatile).

Alex Shpilkin asked Jul 22 '21 12:07

3 Answers

The rules on aliasing are specified in section 6.5p7 of the C standard:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88)

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

88) The intent of this list is to specify those circumstances in which an object may or may not be aliased.

Note that this list allows any object to be accessed through an lvalue of character type (for example via a char *), but not the reverse: an object declared as an array of characters can't be accessed through an lvalue of some other type.
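To make the asymmetry concrete, here is a minimal sketch of both directions. The function names first_byte and load_u32 are made up for illustration. Reading an object's bytes through unsigned char * is covered by the last bullet of the list; going the other way, copying the bytes into a real object of the target type with memcpy avoids the aliasing question altogether.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Allowed by 6.5p7: inspect any object's bytes through a character type. */
unsigned char first_byte(const uint32_t *p)
{
    const unsigned char *bytes = (const unsigned char *)p;
    return bytes[0];
}

/* The reverse direction. If buf points into a declared character array,
 * *(const uint32_t *)buf is not covered by the list above. Copying the
 * bytes into an actual uint32_t object stays within defined behavior. */
uint32_t load_u32(const unsigned char *buf, size_t len)
{
    uint32_t value;
    assert(len >= sizeof value);      /* caller must supply enough bytes */
    memcpy(&value, buf, sizeof value);
    return value;
}
```

Compilers commonly turn such a fixed-size memcpy into a plain load, so the copy usually costs nothing at run time, although that is an optimization and not a guarantee.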

This also means that malloc can't be implemented in a standard-compliant way, since without it there's no way to create memory that has no effective type. However, malloc is considered part of the implementation and can therefore take advantage of its knowledge of implementation internals to return a pointer to a block of memory that a compliant program can use.

dbush answered Oct 24 '22 02:10


The wording “Character arrays must not be reinterpreted as objects of other types” is imprecise. A correct statement is that if you reinterpret a character array as an object of another type (except as allowed by C 2018 6.5 paragraph 7), the C standard does not define the behavior.

As always, if we want to accomplish a task, and the C standard does not define the behavior we want, we can look to other things to define the behavior we want.

If so, does it mean that an allocator that parcels out memory from a statically declared memory region is unimplementable in standard C?

Such an allocator is unimplementable in strictly conforming C, which is C code that does not rely on an unspecified, undefined, or implementation-defined behavior (and does not exceed any minimum implementation limit). It is entirely possible to write such an allocator in conforming C, which is C with extensions. Quite simply, one could put the memory allocation routines in one source file and compile them with a switch that supports aliasing memory as different types. (This is an extension, such as GCC’s -fno-strict-aliasing switch.) Then, when compiling other source files with common compilers, the compiler is blind to the effective type of the memory in the memory allocation source file, so it cannot be affected by the fact that the memory allocation routines use character arrays. (This is another extension, albeit the behavior arises implicitly from our understanding of how compilers and linkers are designed.)
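As a rough sketch of what such a separately compiled allocator file might look like: the names pool, pool_used, pool_alloc and the size POOL_SIZE are invented for this example, and the whole thing relies on the implementation (for instance GCC or Clang with -fno-strict-aliasing) to define what happens when callers store non-character objects in the returned memory. The C standard itself does not.

```c
#include <assert.h>    /* static_assert */
#include <stdalign.h>  /* alignas, alignof */
#include <stddef.h>    /* size_t, max_align_t, NULL */

#define POOL_SIZE 4096u   /* hypothetical pool size */

/* Statically declared backing store. Its declared type is unsigned char[],
 * so using it as anything else is the part the standard leaves undefined. */
static alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used;

static_assert(POOL_SIZE % alignof(max_align_t) == 0,
              "pool size must be a multiple of the maximum alignment");

/* Bump allocator: hands out maximally aligned chunks and never frees. */
void *pool_alloc(size_t size)
{
    const size_t align = alignof(max_align_t);

    if (size == 0 || size > POOL_SIZE)
        return NULL;

    /* Round up so the next chunk also starts on an aligned boundary. */
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > POOL_SIZE - pool_used)
        return NULL;

    void *p = &pool[pool_used];
    pool_used += rounded;
    return p;
}
```

Keeping pool and pool_used static means other translation units only ever see the void * that pool_alloc returns, which is the "compiler is blind" situation described above. Link-time optimization can remove that blindness, so this is an assumption about the build and not something the language guarantees.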

Eric Postpischil answered Oct 24 '22 02:10


WG14 member Jens Gustedt says in a blog post on strict aliasing rules:

Character arrays must not be reinterpreted as objects of other types.

Is that, in fact, true?

Sort of. The language specifications do not actually forbid such reinterpretation via pointer manipulation, but they do specify that accessing a character array or part of one as if it were an object of non-character type produces undefined behavior. If we take avoiding undefined behavior to be of paramount importance then Jens' "must not" follows.

However, "undefined behavior" means that the language specifications do not define the behavior, neither that of the access itself nor that of the entire program that exercises such an access. A program that performs such an access does not conform strictly to the language specifications, but its behavior might nevertheless be perfectly well defined when used with some particular C implementation, perhaps in combination with some other measure, such as specific compilation options. That same program might fail spectacularly -- or very subtly -- with a different C implementation, but that might not be a relevant consideration in some cases.

(I guess the corresponding language in the standard is the part saying that if an object has a declared type, then that type is its effective type.)

Yes.

If so, does it mean that an allocator that parcels out memory from a statically declared memory region is unimplementable in standard C?

As I take you to mean the question, yes. If you declare a large array of some type and hand out pointers (in)to that array, then undefined behavior arises from using those pointers to access regions of the array as if they had types inconsistent with the array's declared type, or where the pointer used for access is not correctly aligned for accessing members of the array as their declared type.

On the other hand, you can write such an allocator to manage access to a pool of a specific type of objects, such that the behavior of accessing allocated objects according to compatible types is well defined.
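A minimal sketch of such a single-type pool follows. The payload type struct node, the capacity, and the function names are all hypothetical. Because the storage is declared as an array of struct node, every access through a struct node * matches the effective type and no aliasing question arises.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical payload type; any single struct type works the same way. */
struct node {
    int key;
    struct node *next;
};

#define NODE_POOL_CAPACITY 8   /* arbitrary size for the sketch */

/* The storage is declared as struct node, so that is its effective type. */
static struct node nodes[NODE_POOL_CAPACITY];
static unsigned char in_use[NODE_POOL_CAPACITY];

/* Return the first free slot, or NULL when the pool is exhausted. */
struct node *node_alloc(void)
{
    for (size_t i = 0; i < NODE_POOL_CAPACITY; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return &nodes[i];
        }
    }
    return NULL;
}

/* Release a slot previously returned by node_alloc(). */
void node_free(struct node *n)
{
    if (n == NULL)
        return;
    ptrdiff_t i = n - nodes;
    assert(i >= 0 && i < NODE_POOL_CAPACITY && in_use[i]);
    in_use[i] = 0;
}
```

The linear scan keeps the sketch short; a free list of indices would make allocation constant time without changing the aliasing picture. The obvious cost of this design is one pool per type.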

I hoped that if I ever found myself in a similar situation in (malloc-less) C, I could just declare a maximally aligned char array and keep using structs the usual way.

You might be able to do so. That's a question of what your specific C implementation affords, above and beyond the language specifications.

I also fail to see what the point of _Alignas could possibly be in such a world, except as a standardized device for expressing non-standard requirements (similar to volatile).

I take you to be supposing that the role of _Alignas is to ensure correct alignment for pointer-based aliasing. Since such aliased access produces undefined behavior, why should one care about such alignment considerations?

Maybe one shouldn't. Certainly I find _Alignas rarely, if ever, to be of any genuine use in my own programming, and I generally write for hosted environments that, therefore, do provide malloc() and afford objects without declared types. But if you are relying on characteristics of your particular C implementation then you may find that _Alignas serves a useful purpose for you.
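For what it's worth, here is a small sketch of what _Alignas (spelled alignas via <stdalign.h>) does and does not buy you in this situation. The type struct sample, the buffer aligned_buf, and the helper is_aligned_for_sample are invented for the example, and the address check assumes that converting a pointer to uintptr_t yields a plain numeric address, which is implementation-defined though nearly universal.

```c
#include <assert.h>    /* static_assert */
#include <stdalign.h>  /* alignas, alignof */
#include <stdint.h>    /* uintptr_t */

/* Hypothetical struct that one might want to place in raw storage. */
struct sample {
    double d;
    int i;
};

static_assert(alignof(struct sample) > 1,
              "example assumes struct sample needs more than byte alignment");

/* alignas guarantees the buffer starts on a suitable boundary. It says
 * nothing about the effective type, which remains unsigned char[]. */
alignas(struct sample) unsigned char aligned_buf[sizeof(struct sample)];

/* Nonzero if p is suitably aligned for struct sample
 * (assumes uintptr_t conversion preserves the address value). */
int is_aligned_for_sample(const void *p)
{
    return (uintptr_t)p % alignof(struct sample) == 0;
}
```

So alignment is the part you can express portably; whether a struct sample may then live in aligned_buf is the part that depends on the implementation, as discussed above.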

John Bollinger answered Oct 24 '22 00:10