
Correct interpretation of clause 6.5 Expressions in the draft C standard

I’m reading draft WG14 N3088, clause 6.5 Expressions, paragraph 7:

7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:98)

— a type compatible with the effective type of the object,

— a qualified version of a type compatible with the effective type of the object,

— a type that is the signed or unsigned type corresponding to the effective type of the object,

— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

— a character type.

Please clarify how exactly the following clause should be interpreted:

— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union)

For example, let's say by "aggregate" I mean a regular structure.

  1. It's unclear where in the description of such a structure a member with the appropriate type should be located? Anywhere?
  2. Does this clause mean that if some int IntObject exists in memory, I can access it using a structure S that includes the same int x among its members in its definition, provided that the structure object "overlays" the memory occupied by the IntObject, such that the expression ((S *)some_address)->x refers to the address of the memory location occupied by the IntObject? Or is it something else?

In other words, I understand it like this: there is a structure of the form

typedef struct {
    float f;
    ...
    int x;
} S;

If the address of the IntObject equals &((S *)some_address)->x, then access to the IntObject via the expression ((S *)some_address)->x is permitted.

Or have I got this all wrong?

Evgeny Ilyin asked Nov 01 '25 02:11


2 Answers

  1. It's unclear where in the description of such a structure a member with the appropriate type should be located? Anywhere?

By the letter of the spec, yes. Anywhere. But that's not as open as it may seem.

There are two possibilities for accessing the value via an lvalue of structure type. The first is a whole-structure access, such as this:

struct S dest = *(struct S *)p;

where the storage of the int in question overlaps the storage of the object that the lvalue *(struct S *)p designates, and struct S has a member of type int. The strict aliasing rule (SAR) permits that access with respect to the int, but that's moot if the rule forbids access to the overall structure via that lvalue. For the SAR to allow it, the effective type of the whole region must be a qualified or unqualified version of a type compatible with struct S. (See also below.)
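As a rough illustration of the first possibility, here is a minimal sketch in which the whole-structure read is unproblematic, because the storage really does hold an object whose effective type is struct S. The names read_whole and demo_whole_struct_access are hypothetical, and the member name an_int is borrowed from the second example in this answer.

```c
#include <assert.h>
#include <stdlib.h>

struct S { float f; int an_int; };

/* Reads the whole structure through an lvalue of type (const) struct S.
   This is only intended for the case where p points to an object whose
   effective type is struct S. */
struct S read_whole(const void *p)
{
    assert(p != NULL);
    return *(const struct S *)p;
}

/* Hypothetical driver: allocated storage takes on effective type
   struct S once it is written through an lvalue of that type. */
int demo_whole_struct_access(void)
{
    struct S *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return -1;

    *obj = (struct S){ 1.5f, 42 };    /* effective type is now struct S */
    int *ip = &obj->an_int;           /* the int lives inside that struct S */

    struct S copy = read_whole(obj);  /* the access covers *ip as well */
    int result = (copy.an_int == *ip) ? copy.an_int : -1;

    free(obj);
    return result;
}
```

Calling demo_whole_struct_access() is expected to return 42. The point of the sketch is only that the int is reached through an aggregate lvalue whose type matches the effective type of the surrounding storage.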

The second possibility is if you consider access via an lvalue of aggregate type to include accesses via expressions that include a member-selection operator on the aggregate, such as this:

int i = ((struct S *)p)->an_int;

where the storage attributed to member an_int overlaps that of the int in question. I think this is the case the spec is primarily targeted at. Note well that this also expresses an access via an lvalue of the member's type, so the SAR applies at that level as well.
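A minimal sketch of the second possibility, under the assumption that p points to a genuine struct S object, so that the access is fine at both the member level and the aggregate level. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct S { float f; int an_int; };

/* Member access through a pointer converted to struct S *.
   The lvalue ((struct S *)p)->an_int has type int. */
int read_member(void *p)
{
    assert(p != NULL);
    return ((struct S *)p)->an_int;
}

/* Hypothetical driver: the int being read is a member of a real
   struct S object, so its address coincides with the member's address. */
int demo_member_access(void)
{
    struct S s = { 2.0f, 7 };
    int *ip = &s.an_int;

    /* The condition described in the question: same address. */
    assert(&((struct S *)&s)->an_int == ip);

    return read_member(&s);
}
```

Here demo_member_access() is expected to return 7. The sketch deliberately avoids the contested case in which the int is not part of any struct S object.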

  2. Does this clause mean that if some int IntObject exists in memory, I can access it using a structure S that includes the same int x among its members in its definition, provided that the structure object "overlays" the memory occupied by the IntObject, such that the expression ((S *)some_address)->x refers to the address of the memory location occupied by the IntObject? Or is it something else?

You're talking here about the second possibility discussed above. The conditions you describe are at least necessary for that kind of access to conform to the strict aliasing rule, but there is some debate over whether they are sufficient. This is exactly the question of whether evaluating the expression ((struct S *)p)->an_int comprises accessing the value of *(struct S *)p. If it doesn't, then we need only consider the narrower region designated by the overall expression ((struct S *)p)->an_int. Otherwise, the SAR applies at the whole-structure level too, so SAR conformance requires the overall region to contain an object whose effective type is a qualified or unqualified version of a type compatible with struct S.

To the best of my knowledge, there is no authoritative answer to which of those interpretations applies. Your safest bet is to assume the more conservative interpretation, that the SAR applies at the level of the aggregate, too. And that's not particularly burdensome here, for there is rarely much advantage to writing code that depends on the more liberal interpretation of the spec in this area.
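One way to stay within the conservative interpretation, sketched here as an assumption rather than as the only option, is to copy the bytes into a real struct S object and read the member from that copy, instead of overlaying a struct S lvalue on storage that holds something else. The names read_x_conservatively and demo_conservative are hypothetical, and the member name x follows the question.

```c
#include <assert.h>
#include <string.h>

struct S { float f; int x; };

/* Copies sizeof(struct S) bytes into a genuine struct S object and
   reads the member from that object. No struct S lvalue is ever
   applied to the source storage itself. */
int read_x_conservatively(const unsigned char *bytes)
{
    struct S tmp;
    assert(bytes != NULL);
    memcpy(&tmp, bytes, sizeof tmp);
    return tmp.x;
}

/* Hypothetical driver: raw holds the object representation of a
   struct S, but its declared type is an array of unsigned char. */
int demo_conservative(void)
{
    unsigned char raw[sizeof(struct S)];
    struct S src = { 0.5f, 123 };

    memcpy(raw, &src, sizeof raw);
    return read_x_conservatively(raw);
}
```

demo_conservative() is expected to return 123. Compilers commonly optimize such a fixed-size memcpy into a plain load, so the cost of this approach is usually small.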

John Bollinger answered Nov 03 '25 10:11


The specific line in the paragraph you quote tells us that any member of a structure or union may be accessed by using the entire structure or union, not just a pointer to that single member.

Consider this code:

#include <stdio.h>

struct S { float f; int i; };
struct S X = { 3.5, 9 };

int foo(struct S *p, int *q)
{
    printf("%d\n", *q);
    *p = X;
    return *q;
}

The compiler loads *q for the printf. For efficiency, the compiler would like to reuse the loaded value for the return value instead of loading it from memory again. Is it allowed to assume *p = X; does not change *q?

No, it is not: *q is an int, and *p has an int member, so *p = X; could change *q.

Specifically, here is how the paragraph you quote should be interpreted for this:

  • The i member of the struct S that p points to has effective type int.
  • One of the types permitted for accessing an object with effective type int is a type compatible with int, and int is compatible with itself, so int may be used to access the i member.
  • Another type permitted for accessing an object with effective type int is an aggregate containing int (since int is one of the aforementioned types). struct S is an aggregate containing int.

Therefore *p = X; conforms to this rule; it accesses the member i (because it accesses the entire structure) using an aggregate containing int.
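To make the aliasing concrete, here is a small sketch of a caller in which q points at the i member of *p. It repeats the definitions from the example so that it compiles on its own. The caller call_foo_with_alias is a hypothetical addition.

```c
#include <assert.h>
#include <stdio.h>

struct S { float f; int i; };
struct S X = { 3.5f, 9 };

int foo(struct S *p, int *q)
{
    printf("%d\n", *q);  /* first load of *q */
    *p = X;              /* may modify *q when q points into *p */
    return *q;           /* must be reloaded, not reused */
}

/* Hypothetical caller: q aliases the int member of *p. */
int call_foo_with_alias(void)
{
    struct S s = { 0.0f, 1 };
    int r = foo(&s, &s.i);
    assert(r == s.i);    /* the returned value reflects the store via *p */
    return r;
}
```

With this caller, foo prints 1 and returns 9, which is why the compiler may not reuse the value it loaded for the printf.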

Eric Postpischil answered Nov 03 '25 10:11