Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Extract the fields of a C struct

I often have to write code in other languages that interact with C structs. Most typically this involves writing Python code with the struct or ctypes modules.

So I'll have a .h file full of struct definitions, and I have to manually read through them and duplicate those definitions in my Python code. This is time consuming and error-prone, and it's difficult to keep the two definitions in sync when they change frequently.

Is there some tool or library in any language (doesn't have to be C or Python) which can take a .h file and produce a structured list of its structs and their fields? I'd love to be able to write a script to generate my automatically generate my struct definitions in Python, and I don't want to have to process arbitrary C code to do it. Regular expressions would work great about 90% of the time and then cause endless headaches for the remaining 10%.

like image 238
Eli Courtwright Avatar asked Jan 12 '10 16:01

Eli Courtwright


4 Answers

If you compile your C code with debugging (-g), pahole (git) can give you the exact structure layouts being used.

$ pahole /bin/dd
…
struct option {
        const char  *              name;                 /*     0     8 */
        int                        has_arg;              /*     8     4 */

        /* XXX 4 bytes hole, try to pack */

        int *                      flag;                 /*    16     8 */
        int                        val;                  /*    24     4 */

        /* size: 32, cachelines: 1, members: 4 */
        /* sum members: 24, holes: 1, sum holes: 4 */
        /* padding: 4 */
        /* last cacheline: 32 bytes */
};
…

This should be quite a lot nicer to parse than straight C.

like image 62
ephemient Avatar answered Nov 07 '22 19:11

ephemient


Regular expressions would work great about 90% of the time and then cause endless headaches for the remaining 10%.

The headaches happen in the cases where the C code contains syntax that you didn't think of when writing your regular expressions. Then you go back and realise that C can't really be parsed by regular expressions, and life becomes not fun.

Try turning it around: define your own simple format, which allows less tricks than C does, and generate both the C header file and the Python interface code from your file:

define socketopts
    int16 port
    int32 ipv4address
    int32 flags

Then you can easily write some Python to convert this to:

typedef struct {
    short port;
    int ipv4address;
    int flags;
} socketopts;

and also to emit a Python class which uses struct to pack/unpack three values (possibly two of them big-endian and the other native-endian, up to you).

like image 28
Steve Jessop Avatar answered Nov 07 '22 18:11

Steve Jessop


Have a look at Swig or SIP that would generate interface code for you or use ctypes.

like image 3
Gregory Pakosz Avatar answered Nov 07 '22 17:11

Gregory Pakosz


Have you looked at Swig?

like image 2
mmr Avatar answered Nov 07 '22 18:11

mmr