
Getting the offset of a variable inside a struct is based on the NULL pointer, but why?

Tags: c

I found a trick in a YouTube video explaining how you can get the offset of a struct member by using a NULL pointer. I understand the code snippet below (the casts, the ampersand, and so on), but I do not understand why this works with the NULL pointer. I thought that the NULL pointer could not point to anything. So I cannot mentally visualize how it works. Second, the NULL pointer is not always represented by the compiler as being 0; sometimes it is a non-zero value. But then how could this piece of code work correctly? Or wouldn't it work correctly anymore?

#include <stdio.h>

int main(void)
{
    /* Getting the offset of a variable inside a struct */
    typedef struct {
        int a;
        char b[23];
        float c;
    } MyStructType;

    unsigned offset = (unsigned)(&((MyStructType * )NULL)->c);

    printf("offset = %u\n", offset);

    return 0;
}

asked Aug 08 '19 12:08 by Meerkat




2 Answers

Note that this code actually has undefined behaviour. Dereferencing a NULL pointer is never allowed, even if no value is accessed and only an address is computed (this was a root cause of a Linux kernel exploit).

Use offsetof (declared in <stddef.h>) instead as a safe alternative.


As to why it seems to work with a NULL pointer: it assumes that NULL is 0. Basically you could use any pointer and calculate:

MyStructType t; 
unsigned off = (unsigned)(&(&t)->c) - (unsigned)&t;

If &t == 0, this becomes:

unsigned off = (unsigned)(&((MyStructType *)0)->c) - 0;

Subtracting 0 is a no-op, which leaves exactly the expression from the question.

answered Nov 12 '22 17:11 by king_nak


"I found a trick in a YouTube video explaining how you can get the offset of a struct member by using a NULL pointer."

Well, at least you came here to ask about the random Internet advice you turned up. We're an Internet resource ourselves, of course, but I like to think that our structure and reputation give you a basis for estimating the reliability of what we have to say.

"I understand the code snippet below (the casts, the ampersand, and so on), but I do not understand why this works with the NULL pointer. I thought that the NULL pointer could not point to anything."

Yes, from the perspective of C semantics, a null pointer definitely does not point to anything, and NULL is a null pointer constant.

"So I cannot mentally visualize how it works."

The (flawed) idea is that

  • NULL is equivalent to a pointer to address 0 in a flat address space (unsafe assumption);
  • ((MyStructType * )NULL)->c designates the member c of an altogether hypothetical object of type MyStructType residing at that address (not supported by the standard);
  • applying the & operator yields the address that such a member would have if it in fact existed (not supported by the standard); and
  • converting the resulting address to an integer yields an address in the assumed flat address space, expressed in units the size of a C char (in no way guaranteed);
  • so that the resulting integer simultaneously represents both an absolute address and an offset (follows from the previous assumptions, because the supposed base address of the hypothetical structure is 0).

"Second, the NULL pointer is not always represented by the compiler as being 0; sometimes it is a non-zero value."

Quite right, that is one of the flaws in the scheme presented.

"But then how could this piece of code work correctly? Or wouldn't it work correctly anymore?"

Although the Standard provides no basis to justify relying on the code to behave as advertised, that does not mean that it must necessarily fail. C implementations do need to be internally consistent about how they represent null pointers, and -- to a certain degree -- about how they convert between pointers and integers. It turns out to be fairly common that the code's assumptions about those things are in fact satisfied by implementations.

So in practice, the code does work with many C implementations. But it systematically produces the wrong answer with some others, and there may be some in which it produces the right answer some appreciable fraction of the time, but the wrong answer the rest of the time.

answered Nov 12 '22 18:11 by John Bollinger