
Does this avoid UB?

This question is more of an academic one, seeing as there is no valid reason to write your own offsetof macro anymore. Nevertheless, I've seen this home-grown implementation pop up here and there:

#define offsetof(s, m) ((size_t) &(((s *)0)->m))

Which is, technically speaking, dereferencing a null pointer (AFAICT):

C11 (ISO/IEC 9899:201x) §6.3.2.3 Pointers, paragraph 3

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.

So the above implementation is, according to how I read the standard, the same as writing:

#define offsetof(s, m) ((size_t) &(((s *)NULL)->m))

This makes me wonder whether, by changing one tiny detail, the following definition of offsetof would be completely legal and reliable:

#define offsetof(s, m) (((size_t)&(((s *) 1)->m)) - 1)

Seeing as 1 is used as the pointer instead of 0, and I subtract 1 at the end, the result should be the same, but I'm no longer using a null pointer. As far as I can tell, the results are the same.

So basically: is there any reason why using 1 instead of 0 in this offsetof definition might not work? Can it still cause UB in certain cases, and if so, when and how? Am I missing anything here?

Elias Van Ootegem asked Apr 22 '15 08:04


3 Answers

Both definitions invoke undefined behavior: the first dereferences a null pointer, and the second dereferences an invalid pointer (one that does not point to a valid object). It is not possible to write a portable version of the offsetof macro in C.

Defect Report #44 says:

"In particular, this is why the offsetof macro exists: there was otherwise no portable means to compute such translation-time constants."

(DR#44 is for C89 but nothing has changed in the language in C99 and C11 that would allow a portable implementation.)
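To make the point above concrete, here is a minimal sketch of relying on the standard offsetof from <stddef.h> instead of a home-grown macro. The struct record type, the VALUE_OFFSET constant, and the value_offset function are hypothetical names used only for illustration. Because the standard macro expands to an integer constant expression, it can be used in static assertions and enum constants, which is the "translation-time constant" property the defect report refers to.

```c
#include <assert.h>   /* static_assert (C11), assert */
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical struct, used only for illustration. */
struct record {
    char   tag;
    int    count;
    double value;
};

/* offsetof is an integer constant expression, so these checks
 * are evaluated by the compiler, not at run time. */
static_assert(offsetof(struct record, tag) == 0,
              "the first member sits at offset 0");
static_assert(offsetof(struct record, count) >= sizeof(char),
              "count is placed after tag");

/* An enum constant also requires an integer constant expression. */
enum { VALUE_OFFSET = offsetof(struct record, value) };

/* Returns the implementation's offset of the value member. */
size_t value_offset(void)
{
    return offsetof(struct record, value);
}
```

The exact offsets depend on the implementation's padding and alignment rules, so the sketch only asserts relationships that hold on every conforming implementation.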

ouah answered Sep 20 '22 11:09


I believe the behaviour is implementation-defined. In 6.3.2.3 of n1256:

5 An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.

user4098326 answered Sep 17 '22 11:09


One problem is that your created pointer does not point to an object.

6.2.4 Storage durations of objects

  2. The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined. The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.

and

J.2 Undefined behaviour
- The value of a pointer to an object whose lifetime has ended is used (6.2.4).

3.19.2 indeterminate value: either an unspecified value or a trap representation

When you convert 1 to a pointer, and the created pointer does not point to an object, the value of the pointer becomes indeterminate. You then use the pointer. Both of those cause undefined behavior.

The conversion of an integer to a pointer is also problematic:

6.3.2.3 Pointers

  5. An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
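
For contrast, here is a minimal sketch of an offset computation that avoids both problems by starting from a real, live object rather than from an integer converted to a pointer. The struct sample type and the count_offset_runtime function are hypothetical names for illustration. Both pointers point into the same object, so subtracting them as character pointers is defined, but the result is only available at run time and is not a constant expression, which is why this is not a replacement for offsetof.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* offsetof, ptrdiff_t, size_t */

/* Hypothetical struct, used only for illustration. */
struct sample {
    char   tag;
    int    count;
    double value;
};

/* Computes the offset of `count` from an actual object. */
size_t count_offset_runtime(void)
{
    struct sample s = {0};

    /* Both pointers refer to bytes of the same live object `s`. */
    const char *base   = (const char *)&s;
    const char *member = (const char *)&s.count;

    ptrdiff_t diff = member - base;
    assert(diff >= 0);   /* a member never precedes its struct */
    return (size_t)diff;
}
```

On any conforming implementation this should agree with offsetof(struct sample, count), at the cost of needing an object to exist.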
2501 answered Sep 16 '22 11:09