
How can I initialize a flexible array in rodata and create a pointer to it?

In C, the code

char *c = "Hello world!";

stores Hello world!\0 in rodata and initializes c with a pointer to it. How can I do this with something other than a string?

Specifically, I am trying to define my own string type

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

And then want some sort of macro so that I can say

const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

And have it behave the same way, in that the length followed by the characters (for example \x0c\0\0\0Hello world! with a 4-byte little-endian size_t) is stored in rodata and c2 is initialized with a pointer to it.

I tried using

#define PASCAL_STRING_CONSTANT(c_string_constant) \
    &((const PascalString) { \
        .Length=sizeof(c_string_constant)-1, \
        .Data=(c_string_constant), \
    })

as suggested in these questions, but it doesn't work because Data is a flexible array member. GCC reports error: non-static initialization of a flexible array member (clang gives a similar error).

Is this possible in C? And if so, what would the PASCAL_STRING_CONSTANT macro look like?

To clarify

With a C string, the following code-block never stores the string on the stack:

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    const char *c = "Hello world!";

    printf("test %s", c);

    return 0;
}

As we can see by looking at the assembly, line 5 compiles to just loading a pointer into a register.

I want the same behavior with Pascal strings, and with GNU extensions this is possible. The following code also never stores the Pascal string on the stack:

#include <inttypes.h>
#include <stdio.h>

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
        static const PascalString _tmpstr = { \
            .Length=sizeof(c_string_constant)-1, \
            .Data=c_string_constant, \
        }; \
        &_tmpstr; \
    })

int main(void) {
    const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

    printf("test %.*s", (int)c2->Length, c2->Data);

    return 0;
}

Looking at its generated assembly, line 18 is also just loading a pointer.

However, the best code I've found that does this in ANSI C copies the entire string onto the stack:

#include <inttypes.h>
#include <stdio.h>

typedef struct {
   size_t Length;
   char Data[];
} PascalString;

#define PASCAL_STRING_CONSTANT(initial_value) \
    (const PascalString *)&(const struct { \
        size_t Length; \
        char Data[sizeof(initial_value)]; \
    }){ \
        .Length = sizeof(initial_value)-1, \
        .Data = initial_value, \
    }

int main(void) {
    const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");

    printf("test %.*s", (int)c2->Length, c2->Data);

    return 0;
}

In the generated assembly for this code, line 19 copies the entire struct onto the stack then produces a pointer to it.

I'm looking for either ANSI C code that produces the same assembly as my second example, or an explanation of why that's not possible with ANSI C.

Gavin S. Yancey asked Sep 26 '19 01:09



2 Answers

You can use this macro, which takes the name of the variable and its contents:

#define PASCAL_STRING(name, str) \
    struct { \
        unsigned char len; \
        char content[sizeof(str) - 1]; \
    } name = { sizeof(str) - 1, str }

Use it like this to create such a string:

const PASCAL_STRING(c2, "Hello world!");
dbush answered Nov 13 '22 17:11



This can be done with the GNU statement-expressions extension, although it is nonstandard.

#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
        static const PascalString _tmpstr = { \
            .Length=sizeof(c_string_constant)-1, \
            .Data=c_string_constant, \
        }; \
        &_tmpstr; \
    })

The extension lets a block enclosed in ({ ... }) be used as an expression whose value is that of the last statement in the block. Thus, we can declare our PascalString as a static const object and then yield a pointer to it.

For completeness, we can also make a stack buffer if we want to modify it:

#define PASCAL_STRING_STACKBUF(initial_value, capacity) \
    (PascalString *)&(struct { \
        size_t Length; \
        char Data[capacity]; \
    }){ \
        .Length = sizeof(initial_value)-1, \
        .Data = initial_value, \
    }
Gavin S. Yancey answered Nov 13 '22 17:11
