Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Struct alignment safe use

I'm new in a company where the following use of a struct is done:

#include <stdio.h>
#include <string.h>

typedef unsigned char uint8;
typedef signed char int8;
typedef unsigned short int uint16;
typedef signed short int int16;

typedef struct padded_t {
    int8 array0[11];
    int8 array1[4];
    uint16 len2;
    int8 array2[25];
    uint16 len3;
    int8 array3[6];
    // many more len/arrays follow
} padded_t;

int main(int argc, char** argv)
{
    padded_t foo;
    memset((void*)&foo, 0, sizeof(padded_t));

    int8* str = "foobar";
    int16 size = (int16)strlen(str);

    int8* ptr = (int8*)&foo.len2;

    // please note that the memcpy references only the pointer to len
    // are the following couple of lines safe?
    memcpy ((void*)ptr, &size, 2);
    memcpy ((void*)&ptr[2], (void*)str, size);

    printf("%d\n", foo.len2);
    printf("%s\n", foo.array2);

    return 0;
}

I know some things about alignment and padding, and I assume the compiler (gnu C99 for an ARM9 device) will add some paddings to make the struct aligned.

But is this code safe?
As I understand it, it will be safe as long as the uint16 len variables are immediately followed by int8 array[] variables regardless of the other struct members.

Will it only add paddings before a uint16 when the size before it is odd?
Is this use correct? And more importantly, is it safe?

like image 332
learning_frog Avatar asked Aug 10 '16 17:08

learning_frog


2 Answers

You don't need to write code that works on every system. What you should do is write code that has predictable behavior. Either it works as designed or if your assumptions don't hold, a static assertion aborts the compilation.

The second memcpy line fails to do that. It assumes that offsetof(struct padded_t, len2) + 2 == offsetof(struct padded_t, array2). An assumption which will often hold, but is utterly stupid.

Why not just write

foo.len2 = strlen(str);
memcpy (foo.array2, str, foo.len2);
//possibly, foo.array2[foo.len2] = '\0';

Code is readable. No unneeded variables. No unnecessary casts. No unpredictable behavior. The original doesn't look like code you will learn a lot from, the clutter doesn't look like something I expect someone proficient in C to write.

To answer a comment of yours, making them packed is the wrong fix. Because it will misalign members and will just open another can of worms.

Also keep wary of custom fixed-size typedefs. I once had the fun of debugging a typedef char int8; on a system that had chars unsigned

like image 86
a3f Avatar answered Oct 17 '22 17:10

a3f


But is this code safe?

As I understand it, it will be safe as long as the uint16 len variables are immediately followed by int8 array[] variables regardless of the other struct members.

It is not safe in the sense that compilers are allowed free rein to insert any amount of padding between or after structure members, so you cannot be confident that &ptr[2] points to the first byte of foo.array2. Provided that uint16 is indeed two bytes wide, however, (which is in no way guaranteed by the language) you can be confident that if size is less than 25 then the

memcpy ((void*)&ptr[2], (void*)str, size);

will modify neither any bytes of foo.len2 nor the last byte of foo.array2. Since foo was earlier filled with zeroes, this will leave foo.array2 as a properly terminated C string. It is thus safe to print it with printf(). On the other hand, C does not guarantee that the result of doing so will be the same as the result of printing str.

Will it only add paddings before a uint16 when the size before it is odd?

This is at the discretion of the compiler. It might be influenced by pragmas, command-line options, configuration options, language extensions (though none of these are in use in the example), target architecture, or anything else that the compiler wants to use to make such decisions.

Is this use correct?

As far as I can tell, the program is conforming, if that's what you mean.

And more importantly, is it safe?

The program's output is not predictable from its code alone, so in that sense, no, it is not safe.

like image 26
John Bollinger Avatar answered Oct 17 '22 18:10

John Bollinger