Can a struct alias its own initial and only member?

For example, is this code valid, or does it invoke undefined behavior by violating the aliasing rules?

int x;
struct s { int i; } y;
x = 1;
y = *(struct s *)&x;
printf("%d\n", y.i);

My interest is in using a technique based on this to develop a portable method for performing aliased reads.

Update: here is the intended use case. It is slightly different, but it should be valid if and only if the above is valid:

static inline uint32_t read32(const unsigned char *p)
{
    struct a { char r[4]; };
    union b { struct a r; uint32_t x; } tmp;
    tmp.r = *(struct a *)p;
    return tmp.x;
}

GCC, as desired, compiles this to a single 32-bit load, and it seems to avoid the aliasing issues that could happen if p actually points to a type other than char. In other words, it seems to act as a portable replacement for the GNU C __attribute__((__may_alias__)) attribute. But I'm uncertain whether it's really well-defined...

R.. GitHub STOP HELPING ICE asked Jun 29 '13 21:06


2 Answers

I believe this will still violate the effective type rules. You want to access, through an lvalue expression of type struct a, a memory location that was never declared as containing a struct a (or, for dynamically allocated storage, never acquired that effective type through a store).

None of the sections that have been quoted in other answers can be used to escape this basic restriction.

However, I believe there's a solution to your problem: Use __builtin_memcpy(), which is available even in freestanding environments (see the manual entry on -fno-builtin).
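A minimal sketch of that approach, reusing the read32 name and signature from the question. It assumes a hosted build and calls plain memcpy from <string.h>; in a freestanding GCC build, __builtin_memcpy could be substituted. The static_assert is an extra guard I added, not something the question requires.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in host byte order from a buffer of any
   effective type and any alignment. memcpy copies the bytes as
   character type, so the effective type rules are not violated. */
static inline uint32_t read32(const unsigned char *p)
{
    uint32_t x;
    /* Assumption for this sketch: 8-bit bytes, so uint32_t is 4 bytes. */
    static_assert(sizeof x == 4, "uint32_t is expected to be 4 bytes");
    memcpy(&x, p, sizeof x);
    return x;
}
```

Optimizing compilers commonly turn a fixed-size memcpy like this into a single load, but that is a quality-of-implementation matter rather than a guarantee.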


Note that the issue is a bit less clear-cut than I make it sound. C11 section 6.5 §7 tells us that it's fine to access an object through an lvalue expression that has an aggregate or union type that includes one of the aforementioned types among its members.

The C99 rationale makes it clear that this provision exists so that a pointer to an aggregate and a pointer to one of its members may alias.
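To illustrate the case that clause is meant to cover, here is a small sketch. The function name store_then_read is hypothetical; struct s is the type from the question. Every access here goes to an object that really was declared as a struct s, which is what separates it from the cast in the question.

```c
#include <assert.h>
#include <stddef.h>

struct s { int i; };

/* The first member of a struct always sits at offset 0. */
static_assert(offsetof(struct s, i) == 0, "initial member is at offset 0");

/* The store through *ps uses an lvalue of type struct s, yet it may
   modify the int that pi points to. Because of the aggregate clause,
   the compiler has to assume the two can overlap and reload *pi. */
int store_then_read(struct s *ps, int *pi, struct s v)
{
    *pi = 0;
    *ps = v;       /* struct-typed store, may overwrite *pi */
    return *pi;    /* yields v.i when pi == &ps->i, otherwise 0 */
}
```

Calling it as store_then_read(&y, &y.i, v) for some struct s y is the aliasing situation the rationale describes.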

I believe the ability to use this loophole in the way of the first example (but not the second one, assuming p doesn't happen to point to an actual char [4]) is an unintended consequence, which the standard only fails to disallow because of imprecise wording.

Also note that if the first example were valid, we'd basically be able to sneak structural typing into an otherwise nominally typed language. Structures in a union with a common initial sequence aside (and even then, member names do matter), an identical memory layout is not enough to make types compatible. I believe the same reasoning applies here.

Christoph answered Nov 06 '22 08:11


My reading of the aliasing rules (C99, 6.5p7), given the presence of this sentence:

"an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or"

leads me to think it does not violate the C aliasing rules.

But not violating the aliasing rules is not enough for this code snippet to be valid. It may invoke undefined behavior for other reasons.

(struct s *) &x

is not guaranteed to point to a valid struct s object. Even if we assume the alignment of x is suitable for an object of type struct s, the pointer resulting from the cast may not point to a space large enough to hold the structure object (as struct s may have padding after its last member).
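The size and alignment concerns, at least, can be checked at compile time. The following is only a sketch assuming a C11 compiler; it rejects implementations where struct s is larger or more strictly aligned than int, and it says nothing about whether the access is otherwise valid.

```c
#include <assert.h>
#include <stdalign.h>

struct s { int i; };

/* Fails to compile if the implementation adds trailing padding. */
static_assert(sizeof(struct s) == sizeof(int),
              "struct s has padding after its only member");

/* Fails to compile if struct s needs stricter alignment than int. */
static_assert(alignof(struct s) == alignof(int),
              "struct s is more strictly aligned than int");
```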

EDIT: the answer has been completely reworked from its initial version

ouah answered Nov 06 '22 07:11