
Extending a struct in C

Tags:

c

struct

I recently came across a colleague's code that looked like this:

    typedef struct A {
        int x;
    } A;

    typedef struct B {
        A a;
        int d;
    } B;

    void fn() {
        B *b;
        ((A*)b)->x = 10;
    }

His explanation was that, since struct A is the first member of struct B, ((A*)b)->x is the same as b->a.x and is more readable.
This makes sense, but is this considered good practice? And will this work across platforms? Currently this runs fine on GCC.


asked Mar 06 '14 08:03

rubndsouza

2 Answers

Yes, it will work cross-platform^(a), but that doesn't necessarily make it a good idea.

As per the ISO C standard (all citations below are from C11), 6.7.2.1 Structure and union specifiers /15, no padding is allowed before the first member of a structure.
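The padding rule can be checked with offsetof. This is only a minimal sketch using the A and B types from the question; the two helper function names are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

typedef struct A { int x; } A;
typedef struct B { A a; int d; } B;

/* Byte offset of the first member of B. The standard forbids padding
   before it, so this is always 0. */
size_t first_member_offset(void) {
    return offsetof(B, a);
}

/* Byte offset of the second member. Padding MAY appear before this one,
   so only a lower bound is guaranteed. */
size_t second_member_offset(void) {
    return offsetof(B, d);
}
```

Only the first member gets the zero-offset guarantee; anything after it may be preceded by padding.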

In addition, 6.2.7 Compatible type and composite type states that:

"Two types have compatible type if their types are the same"

and it is undisputed that the A and A-within-B types are identical.

This means that memory accesses to the A fields will be the same whether they are made through an A or through a B, as will the more sensible b->a.x, which is probably what you should be using if you have any concerns about future maintainability.

And, though you would normally have to worry about strict type aliasing, I don't believe that applies here. Accessing an object through an lvalue of an incompatible type is generally undefined behaviour, but the standard has specific exceptions.

6.5 Expressions /7 states some of those exceptions, with the footnote:

"The intent of this list is to specify those circumstances in which an object may or may not be aliased."

The exceptions listed are:

a type compatible with the effective type of the object;
some other exceptions which need not concern us here; and
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union).

That, combined with the struct padding rules mentioned above, including the phrase:

"A pointer to a structure object, suitably converted, points to its initial member"

seems to indicate this example is specifically allowed for. The core point we have to remember here is that the type of the expression ((A*)b) is A*, not B*. That makes the variables compatible for the purposes of unrestricted aliasing.
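To make that concrete, here is a small sketch that writes through the converted pointer and reads the value back through the member. The function name write_via_cast is hypothetical, and the object is properly initialised, unlike in the question's snippet:

```c
#include <assert.h>

typedef struct A { int x; } A;
typedef struct B { A a; int d; } B;

/* Writes through the cast pointer, then reads back through the member.
   Returns the value seen via b.a.x. */
int write_via_cast(int value) {
    B b = { { 0 }, 0 };
    B *pb = &b;

    /* The expression ((A *)pb) has type A * and points at b.a. */
    ((A *)pb)->x = value;

    /* The converted pointer compares equal to the address of the
       first member. */
    assert((A *)pb == &pb->a);

    return b.a.x;
}
```

Calling write_via_cast(10) should give back 10, since both accesses refer to the same int.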

That's my reading of the relevant portions of the standard. I've been wrong before^(b), but I doubt it in this case.

So, if you have a genuine need for this, it will work okay but I'd be documenting any constraints in the code very close to the structures so as to not get bitten in future.
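One possible way to do that documenting is to put the constraint where the compiler can check it. This is a sketch, assuming a C11 compiler for _Static_assert; the helper name B_as_A is hypothetical:

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */

typedef struct A { int x; } A;

/* CONSTRAINT: `a` must stay the FIRST member of B.
   Code elsewhere converts B * to A * and relies on that. */
typedef struct B {
    A   a;
    int d;
} B;

/* Fails the build if someone inserts a member before `a`. */
_Static_assert(offsetof(B, a) == 0, "A must be the first member of B");

/* Single, named place for the conversion; it needs no cast at all. */
static inline A *B_as_A(B *b) {
    return &b->a;
}
```

With a helper like this, callers write B_as_A(pb)->x = 10; and the layout assumption lives next to the structure definitions rather than being scattered through casts.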

^(a) In the general sense. Of course, the code snippet:

B *b; ((A*)b)->x = 10;

will be undefined behaviour because b is not initialised to something sensible. But I'm going to assume this is just example code meant to illustrate your question. If anyone's concerned about it, think of it instead as:

B b, *pb = &b; ((A*)pb)->x = 10;

^(b) As my wife will tell you, frequently and with little prompting :-)


answered Sep 18 '22 20:09

paxdiablo

I'll go out on a limb and oppose @paxdiablo on this one: I think it's a fine idea, and it's very common in large, production-quality code.

It's basically the most obvious and nice way to implement inheritance-based object oriented data structures in C. Starting the declaration of struct B with an instance of struct A means "B is a sub-class of A". The fact that the first structure member is guaranteed to be 0 bytes from the start of the structure is what makes it work safely, and it's borderline beautiful in my opinion.
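As a rough sketch of what I mean (the Shape and Circle types and both functions are invented for illustration and are not taken from GObject):

```c
#include <assert.h>

/* "Base class". */
typedef struct Shape {
    const char *name;
} Shape;

/* "Derived class": the base instance must be the first member. */
typedef struct Circle {
    Shape  base;
    double radius;
} Circle;

/* Works on any object that starts with a Shape. */
const char *shape_name(const Shape *s) {
    return s->name;
}

/* Upcasts a Circle to its base and calls the base-class function. */
const char *circle_name(const Circle *c) {
    return shape_name((const Shape *)c);
}
```

Any function written against Shape can then be handed a Circle, either through the cast or, without a cast, as &c->base.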

It's widely used and deployed in code based on the GObject library, such as the GTK+ user interface toolkit and the GNOME desktop environment.

Of course, it requires you to "know what you're doing", but that is generally the case when implementing complicated type relationships in C. :)

In the case of GObject and GTK+, there's plenty of support infrastructure and documentation to help with this: it's quite hard to forget about it. It might mean that creating a new class isn't something you do just as quickly as in C++, but that's perhaps to be expected since there's no native support in C for classes.

answered Sep 19 '22 20:09

unwind
