Undefined behavior with type casting?

Take the following example:

typedef struct array_struct {
    unsigned char* pointer;
    size_t length;
} array;

typedef struct vector_struct {
    unsigned char* pointer;
    // Reserved is the amount of allocated memory not being used.
    // MemoryLength = length + reserved;
    size_t length, reserved;
} vector;


// Example Usage:
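vector* vct = (vector*) calloc(1, sizeof(vector));
vct->reserved = 0;
vct->length = 24;
// Size the buffer from vct->length; arr is not declared yet at this point.
vct->pointer = (unsigned char*) calloc(vct->length, 1);

array* arr = (array*) vct;
// size_t is printed with %zu, not %i.
printf("%zu", arr->length);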
free(arr->pointer);
free(arr);

C seems to lay out struct members in the order they're declared in the struct. That means if you cast a vector* to an array*, operations performed through the array should give the same results as they would through the vector, since both types start with the same members in the same order.

As long as you only downcast from vector to array, treating array as a generic type for vector, you shouldn't run into any problems.

Is this undefined and bad behavior despite the similar structure of the types?

FatalSleep asked Jun 04 '16 15:06



1 Answer

Whether this works depends on type aliasing. The C standard does not permit this kind of aliasing, so under the standard's rules (commonly referred to as "strict aliasing" because they are pretty strict) it is undefined behavior. Most compilers can permit it, either by default or through a compilation flag (for example -fno-strict-aliasing in GCC and Clang), and in that mode it behaves the way you expect. From the N1570 draft of the C standard:

6.5.2.3

6 One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the complete type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.

That section is about unions, but for that guarantee to hold in unions, the compiler has to lay out the common initial members identically in both structures, padding included. So we've got that going for us.
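As a minimal sketch of what the quoted paragraph does allow, the two structs from the question can be wrapped in a union and the shared members read through either view. The union type array_or_vector and the functions common_length and demo_common_initial_sequence are hypothetical names introduced only for this illustration; they are not part of the question's code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct array_struct {
    unsigned char* pointer;
    size_t length;
} array;

typedef struct vector_struct {
    unsigned char* pointer;
    size_t length, reserved;
} vector;

/* Hypothetical union holding either struct. pointer and length form
   the common initial sequence of the two members. */
typedef union array_or_vector {
    array  as_array;
    vector as_vector;
} array_or_vector;

/* Reads length through the array view. The access goes through the
   union member, and the union's complete type is visible here. */
size_t common_length(const array_or_vector* u) {
    return u->as_array.length;
}

/* Stores a vector in the union, then inspects it as an array. */
size_t demo_common_initial_sequence(void) {
    array_or_vector u;
    u.as_vector.pointer = NULL;
    u.as_vector.length = 24;
    u.as_vector.reserved = 0;

    /* The shared members sit at the same offsets in both structs. */
    assert(offsetof(array, length) == offsetof(vector, length));

    return common_length(&u);
}
```

Note that this only covers access through the union itself. Casting a plain vector* to array*, as in the question, is a different situation and is what the strict aliasing rule addresses.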

Now, for strict aliasing, the standard says:

6.5

7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object
  • [...]

A "compatible type" is:

6.2.7

1 Two types have compatible type if their types are the same.

It goes on to elaborate and lists a few cases that have a little more "wiggle room", but none of them apply here. Unfortunately for you, the buck stops here: accessing a vector object through an lvalue of type array is undefined behavior.

Now, one thing you could do to get around this would be to make the array the first member of the vector:

typedef struct array_struct {
    unsigned char* pointer;
    size_t length;
} array;

typedef struct vector_struct {
    array array;
    size_t reserved;
} vector;
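A short usage sketch of that layout, under the assumption that the rest of the code only needs the array view: take the address of the embedded member instead of casting. The helper array_length and the function demo_nested_vector are hypothetical names used just for this example. The pointer comparison relies on 6.7.2.1 paragraph 15 of N1570, which says a pointer to a structure, suitably converted, points to its initial member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct array_struct {
    unsigned char* pointer;
    size_t length;
} array;

typedef struct vector_struct {
    array array;      /* first member, so it sits at offset 0 */
    size_t reserved;
} vector;

/* Hypothetical helper that only knows about the array type. */
size_t array_length(const array* arr) {
    return arr->length;
}

/* Builds a vector, uses it through its embedded array, frees it.
   Returns the observed length, or 0 if an allocation failed. */
size_t demo_nested_vector(void) {
    vector* vct = calloc(1, sizeof *vct);
    if (vct == NULL) {
        return 0;
    }
    vct->reserved = 0;
    vct->array.length = 24;
    vct->array.pointer = calloc(vct->array.length, 1);
    if (vct->array.pointer == NULL) {
        free(vct);
        return 0;
    }

    /* No cast: arr really does point at an object of type array. */
    array* arr = &vct->array;
    assert((void*) arr == (void*) vct);

    size_t len = array_length(arr);

    free(arr->pointer);
    free(vct);  /* free the pointer that calloc returned */
    return len;
}
```

Because arr points at a genuine array object, reading arr->length does not run into the aliasing rule quoted above.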
Cornstalks answered Sep 21 '22 14:09