In C, the code
char *c = "Hello world!";
stores Hello world!\0 in rodata and initializes c with a pointer to it.
How can I do this with something other than a string?
Specifically, I am trying to define my own string type
typedef struct {
size_t Length;
char Data[];
} PascalString;
I then want some sort of macro so that I can write
const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");
and have it behave the same way, in that \x0c\0\0\0Hello world! is stored in rodata and c2 is initialized with a pointer to it.
I tried using
#define PASCAL_STRING_CONSTANT(c_string_constant) \
&((const PascalString) { \
.Length=sizeof(c_string_constant)-1, \
.Data=(c_string_constant), \
})
as suggested in these questions, but it doesn't work because Data is a flexible array member. GCC reports "error: non-static initialization of a flexible array member" (Clang gives a similar error).
Is this possible in C? And if so, what would the PASCAL_STRING_CONSTANT macro look like?
To clarify
With a C string, the following code-block never stores the string on the stack:
#include <inttypes.h>
#include <stdio.h>
int main(void) {
const char *c = "Hello world!";
printf("test %s", c);
return 0;
}
As we can see by looking at the assembly, line 5 compiles to just loading a pointer into a register.
I want the same behavior with Pascal strings, and with GNU extensions it is possible. The following code also never stores the Pascal string on the stack:
#include <inttypes.h>
#include <stdio.h>
typedef struct {
size_t Length;
char Data[];
} PascalString;
#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
static const PascalString _tmpstr = { \
.Length=sizeof(c_string_constant)-1, \
.Data=c_string_constant, \
}; \
&_tmpstr; \
})
int main(void) {
const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");
printf("test %.*s", (int)c2->Length, c2->Data);
return 0;
}
Looking at its generated assembly, line 18 is also just loading a pointer.
However, the best code I've found to do this in ANSI C produces code to copy the entire string onto the stack:
#include <inttypes.h>
#include <stdio.h>
typedef struct {
size_t Length;
char Data[];
} PascalString;
#define PASCAL_STRING_CONSTANT(initial_value) \
(const PascalString *)&(const struct { \
size_t Length; \
char Data[sizeof(initial_value)]; \
}){ \
.Length = sizeof(initial_value)-1, \
.Data = initial_value, \
}
int main(void) {
const PascalString *c2 = PASCAL_STRING_CONSTANT("Hello world!");
printf("test %.*s", (int)c2->Length, c2->Data);
return 0;
}
In the generated assembly for this code, line 19 copies the entire struct onto the stack then produces a pointer to it.
I'm looking for either ANSI C code that produces the same assembly as my second example, or an explanation of why that's not possible with ANSI C.
You can use this macro, which takes the name of the variable and its contents:
#define PASCAL_STRING(name, str) \
struct { \
unsigned char len; \
char content[sizeof(str) - 1]; \
} name = { sizeof(str) - 1, str }
Use it like this to create such a string:
const PASCAL_STRING(c2, "Hello world!");
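The same named-object idea can be adapted to the question's PascalString type in standard C. The sketch below is one possible arrangement, not the only one: PASCAL_STRING_DEFINE, the _storage suffix, pascal_equals and print_hello are hypothetical names. It relies on the same pointer cast as the question's third example, which assumes the two structs share a layout. The standard does not strictly guarantee that, although common compilers lay them out identically. Because the backing object is declared static const, nothing is copied at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    size_t Length;
    char Data[];
} PascalString;

/* Hypothetical macro: defines a static, fixed-size backing object and a
   const PascalString pointer called `name` that refers to it.
   Data is sized with sizeof(literal), so it keeps the trailing NUL. */
#define PASCAL_STRING_DEFINE(name, literal)                  \
    static const struct {                                    \
        size_t Length;                                       \
        char Data[sizeof(literal)];                          \
    } name##_storage = { sizeof(literal) - 1, literal };     \
    static const PascalString *const name =                  \
        (const PascalString *)&name##_storage

PASCAL_STRING_DEFINE(hello, "Hello world!");

/* Hypothetical helper: returns 1 if s holds exactly the n given bytes. */
int pascal_equals(const PascalString *s, const char *bytes, size_t n) {
    assert(s != NULL && bytes != NULL);
    return s->Length == n && memcmp(s->Data, bytes, n) == 0;
}

/* Prints the string using an explicit length, as in the question. */
void print_hello(void) {
    printf("test %.*s\n", (int)hello->Length, hello->Data);
}
```

The trade-off compared with the GNU version is that the string needs a name and a declaration of its own, so it cannot be used inline as an expression.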
This can be done with the statement-expressions GNU extension, although it is nonstandard:
#define PASCAL_STRING_CONSTANT(c_string_constant) ({\
static const PascalString _tmpstr = { \
.Length=sizeof(c_string_constant)-1, \
.Data=c_string_constant, \
}; \
&_tmpstr; \
})
The extension allows you to use a block of multiple statements as an expression, by enclosing the block in ({ ... }); the expression evaluates to the value of the last statement. Thus, we can declare our PascalString as a static const value, and then return a pointer to it.
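One consequence of the static declaration is worth spelling out: the object outlives the function in which the macro is expanded, so handing the pointer back to a caller is safe. That would not hold for a compound literal written inside a function body, which has automatic storage duration. A minimal sketch, assuming a compiler that supports GNU statement expressions and static initialization of flexible array members (GCC and Clang do); get_greeting is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t Length;
    char Data[];
} PascalString;

/* Macro from the answer above (GNU extensions, not standard C). */
#define PASCAL_STRING_CONSTANT(c_string_constant) ({ \
    static const PascalString _tmpstr = {            \
        .Length = sizeof(c_string_constant) - 1,     \
        .Data = c_string_constant,                   \
    };                                               \
    &_tmpstr;                                        \
})

/* Hypothetical helper: the returned pointer stays valid after the call
   because _tmpstr has static storage duration. */
const PascalString *get_greeting(void) {
    const PascalString *s = PASCAL_STRING_CONSTANT("Hello world!");
    assert(s->Length == 12); /* sizeof("Hello world!") - 1 */
    return s;
}
```

Each expansion of the macro creates its own static object, so repeated calls to get_greeting return the same address.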
For completeness, we can also make a stack buffer if we want to modify it:
#define PASCAL_STRING_STACKBUF(initial_value, capacity) \
(PascalString *)&(struct { \
size_t Length; \
char Data[capacity]; \
}){ \
.Length = sizeof(initial_value)-1, \
.Data = initial_value, \
}
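A short usage sketch for the stack-buffer variant, under a few assumptions: the anonymous struct's Length is declared as size_t so that its layout matches PascalString, the capacity (16 here) is tracked by the caller because the struct does not store it, and pascal_push and build_greeting are hypothetical helpers. The buffer only lives until the enclosing block ends, so the result is copied out before returning. Like the macro itself, this relies on a pointer cast between two struct types that the standard does not strictly sanction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    size_t Length;
    char Data[];
} PascalString;

#define PASCAL_STRING_STACKBUF(initial_value, capacity) \
    (PascalString *)&(struct {                          \
        size_t Length;                                  \
        char Data[capacity];                            \
    }){                                                 \
        .Length = sizeof(initial_value) - 1,            \
        .Data = initial_value,                          \
    }

/* Hypothetical helper: appends one byte; returns 0 if the buffer is full. */
static int pascal_push(PascalString *s, size_t capacity, char ch) {
    assert(s != NULL);
    if (s->Length >= capacity) return 0;
    s->Data[s->Length++] = ch;
    return 1;
}

/* Builds "Hello!" in a 16-byte stack buffer, copies it into out as a
   C string, and returns the Pascal length (0 if out is too small). */
size_t build_greeting(char *out, size_t out_size) {
    PascalString *buf = PASCAL_STRING_STACKBUF("Hello", 16);
    pascal_push(buf, 16, '!');
    if (out_size <= buf->Length) return 0; /* no room for data + NUL */
    memcpy(out, buf->Data, buf->Length);
    out[buf->Length] = '\0';
    return buf->Length;
}
```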