
How can I print the offset of a struct member at compile time?

Tags: c, gcc

Given a struct, for instance:

struct A {
    char a;
    char b;
} __attribute__((packed));

I want the offset of b within the struct (1 in this example) to be printed at compile time. I don't want to run the program and call something like printf("%zu", offsetof(struct A, b)); because printing is non-trivial on my platform. I want the compiler itself to print the offset, something like:

> gcc main.c
The offset of b is 1

I've tried a few approaches using #pragma message and offsetof, with my closest being:

#define OFFSET offsetof(struct A, b)
#define XSTR(x) STR(x)
#define STR(x) #x

#pragma message "Offset: " XSTR(OFFSET)

which just prints:

> gcc main.c
main.c:12:9: note: #pragma message: Offset: __builtin_offsetof (struct A, b)

which does not print the numeric offset. It's possible to binary-search the offset at compile time by using _Static_assert - but my real structs are big and this can get a bit cumbersome.
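
For reference, the _Static_assert probing mentioned above might look like the following minimal sketch. The bounds here are picked by hand for the two-member struct A. With a large struct, every failed build only narrows the range, and the bounds then have to be edited and the file recompiled, which is what makes the approach tedious.

```c
#include <stddef.h>

struct A {
    char a;
    char b;
} __attribute__((packed));

/* Each assertion tests one bound. A failing assertion stops the build,
   so the bound is adjusted by hand and the file is compiled again. */
_Static_assert(offsetof(struct A, b) >= 1, "offset of b is below 1");
_Static_assert(offsetof(struct A, b) <= 1, "offset of b is above 1");
```

Once both assertions pass with the same bound, that bound is the offset.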

Daniel Kleinstein asked Sep 16 '21 12:09


4 Answers

I suspect the stated constraint “I want the offset to be printed by the compiler itself” is an XY problem and that we merely need the offset to be printed by the build tools on the system used for building, not specifically by the compiler.

In this case, GCC and Clang have the ability to include arbitrary text in their assembly output and to include various data operands in that text, including immediate values for structure offsets.

Inside any function, in a file that includes <stddef.h> for offsetof, add these lines:

#if GenerateStructureOffsets
    __asm__("## offsetof(struct A, b) = %c0" : : "i" (offsetof(struct A, b)));
#endif

Then compile with the switches -DGenerateStructureOffsets and -S. The compiler will generate a file named SourceFileName.s, and you can use -o Name to give it a different name if desired.

Then grep "## offsetof" Name will find this line, showing something like:

    ## offsetof(struct A, b) = 1

Then you can use sed or other tools to extract the value.

In the __asm__, "i" says to generate an “immediate” operand, and the (offsetof(struct A, b)) that follows it supplies the operand's value. In the template string, %c0 is replaced with the value of that operand.

The 0 indicates which operand to substitute: if more than one operand is listed in the __asm__, they are numbered 0, 1, 2, 3, and so on. (There is also a mechanism for naming them instead of numbering them, not shown here.) Normally, %0 would be replaced by the form of immediate operand suitable for the target assembly language, such as $1 or #1. However, the c modifier says to use the bare constant, so the replacement text is just the value, in this case 1.
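
As an illustration of the operand numbering and naming described above, here is a minimal sketch that emits several values from one function. The function name emit_offsets and the operand name size are hypothetical, chosen only for this example. It assumes GCC or Clang and a target whose assembler treats # as a comment character (x86, for instance).

```c
#include <stddef.h>

struct A {
    char a;
    char b;
} __attribute__((packed));

/* Hypothetical carrier function: it exists only to hold the asm
   statements, so nothing needs to call it in a real build. */
void emit_offsets(void)
{
    /* Two numbered operands, substituted for %c0 and %c1. */
    __asm__("## offsets: a = %c0, b = %c1"
            : /* no outputs */
            : "i" (offsetof(struct A, a)), "i" (offsetof(struct A, b)));

    /* One named operand, substituted for %c[size]. */
    __asm__("## sizeof(struct A) = %c[size]"
            : /* no outputs */
            : [size] "i" (sizeof(struct A)));
}
```

Compiled with -S, the assembly output should contain comment lines ending in a = 0, b = 1 and = 2 for this packed struct, which grep can pick out in the same way as before.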

Eric Postpischil answered Oct 23 '22 11:10


Given this macro:

#define PRINT_OFFSETOF(A, B) char (*__daniel_kleinstein_is_cool)[sizeof(char[offsetof(A, B)])] = 1

Use it inside your main() function (or any other function):

struct Test {
  char x;
  long long y;
  int z;
};

int main(void) {
  PRINT_OFFSETOF(struct Test, z);
  return 0;
}

And you will get this warning:

warning: initialization of ‘char (*)[16]’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]

Hence, offsetof(struct Test, z) == 16.


NOTE: in case offsetof() returns 0 (e.g.: PRINT_OFFSETOF(struct Test, x)), the compiler warning will have char (*)[] instead of char (*)[16].

NOTE 2: I only tested this with GCC.

Luca Polito answered Oct 23 '22 10:10


Surprisingly, it looks like __builtin_choose_expr works inside the __deprecated__ function attribute. Take the following program:

#include <stddef.h>

struct A {
    char a;
    char b;
} __attribute__((packed));

#define printval_case(x, xstr, y, ...)  __builtin_choose_expr(x == y, xstr"="#y, __VA_ARGS__)
#define printval(x) do { \
    __attribute__((__deprecated__( \
        printval_case(x, #x, 0, \
        printval_case(x, #x, 1, \
        printval_case(x, #x, 2, \
        printval_case(x, #x, 3, \
        /* etc... */ \
        (void)0 )))) \
    ))) void printval() {} \
    printval(); \
} while (0)

int main() {
    printval(offsetof(struct A, a));
    printval(offsetof(struct A, b));
}

When compiled, gcc outputs:

<source>:23:30: warning: 'printval' is deprecated: offsetof(struct A, a)=0 [-Wdeprecated-declarations]
<source>:24:30: warning: 'printval' is deprecated: offsetof(struct A, b)=1 [-Wdeprecated-declarations]

In a similar fashion, you could embed the value into the executable (much as CMake does when it detects compiler properties):

#include <stddef.h>
struct A {
    char a;
    char b;
} __attribute__((packed));
#define printval_case(x, xstr, y, ...)  __builtin_choose_expr(x == y, xstr"="#y, __VA_ARGS__)
#define embedval(x) do { \
    static const __attribute__((__used__)) char unused[] = \
        printval_case(x, #x, 0, \
        printval_case(x, #x, 1, \
        printval_case(x, #x, 2, \
        printval_case(x, #x, 3, \
        /* etc... */ \
        (void)0 )))); \
} while (0)
int main() {
    embedval(offsetof(struct A, a));
    embedval(offsetof(struct A, b));
}

then:

$ gcc file.c && strings ./a.out | grep offsetof
offsetof(struct A, b)=1
offsetof(struct A, a)=0
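
For large structs, listing one printval_case per possible value gets long. A possible alternative is to build the string from decimal digits computed at compile time. The following is a minimal sketch of that idea, not part of the approach above: the names DEC4 and offset_of_b are hypothetical, and it assumes the offset is below 10000.

```c
#include <stddef.h>

struct A {
    char a;
    char b;
} __attribute__((packed));

/* Expands to the four decimal digit characters of n (0..9999),
   most significant digit first, padded with leading zeros. */
#define DEC4(n)                \
    ('0' + ((n) / 1000) % 10), \
    ('0' + ((n) / 100) % 10),  \
    ('0' + ((n) / 10) % 10),   \
    ('0' + (n) % 10)

/* Hypothetical tag string. __used__ asks the compiler to keep it in
   the object file even though nothing references it. */
static const char offset_of_b[] __attribute__((__used__)) = {
    'A', '.', 'b', '=', DEC4(offsetof(struct A, b)), '\0'
};
```

Running strings on the resulting binary and filtering for A.b= should then show the zero-padded offset.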
KamilCuk answered Oct 23 '22 10:10


Perhaps a new pre-processing step would be acceptable. This could then be done as a separate step that won't affect your production binary.

offsetdumper.sh

#!/bin/bash
#
# Pre-process some source file(s), then append a macro, a main() and a file
# with rules describing the interesting symbols. Compile and run the result.

dumprulefile="$1"
shift

# Define your own macros, like OFFSET, in the "Here Document" below:
{
gcc -E "$@" && cat<<EOF
#define OFFSET(x,y) do { printf("%s::%s %zu\n", #x, #y, offsetof(x,y)); } while(0)
#include <stddef.h>
#include <stdio.h>
int main() {
EOF
cat "$dumprulefile"
echo '}'
} | gcc -x c - && ./a.out

rules

OFFSET(A,a);
OFFSET(A,b);

source.h

typedef struct {
    char a;
    char b;
} __attribute__((packed)) A;

Example:

$ ./offsetdumper.sh rules *.h
A::a 0
A::b 1

This is a bit fragile and won't work if any of the pre-processed files defines a main function, so it may need some tinkering to fit your needs.

Ted Lyngmo answered Oct 23 '22 10:10