Although the size of a zero-length array is zero, an array member of this kind may increase the size of the enclosing type as a result of tail padding. The offset of a zero-length array member from the beginning of the enclosing structure is the same as the offset of an array with one or more elements of the same type.
In the C programming language, a variable must be declared before a value is assigned to it. If an array initializer supplies fewer elements than the specified size of the array, the remaining elements are set to 0 by default.
It means an array with zero length. – JJJ. Dec 27, 2017 at 13:12. It is initialized as array but has an object structure.
It will create an empty array object. This is still a perfectly valid object - and one which takes up a non-zero amount of space in memory. It will still know its own type, and the count - it just won't have any elements.
This is a way to have variable-sized data without having to call malloc (kmalloc in this case) twice. You would use it like this:
struct bts_action *var = kmalloc(sizeof(*var) + extra, GFP_KERNEL);
This used to be non-standard and was considered a hack (as Aniket said), but it was standardized in C99. The standard form is now:
struct bts_action {
u16 type;
u16 size;
u8 data[];
} __attribute__ ((packed)); /* Note: the __attribute__ is irrelevant here */
Note that you don't mention any size for the data field. Note also that this special member can only come at the end of the struct.
In C99, this matter is explained in 6.7.2.1.16 (emphasis mine):
As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply. However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.
Or in other words, if you have:
struct something
{
/* other variables */
char data[];
};
struct something *var = malloc(sizeof(*var) + extra);
You can access var->data with indices in [0, extra). Note that sizeof(struct something) will only give the size accounting for the other variables, i.e. it gives data a size of 0.
It may also be interesting to note how the standard itself gives an example of malloc-ing such a construct (6.7.2.1.17):
struct s { int n; double d[]; };
int m = /* some value */;
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
Another interesting note by the standard in the same location is (emphasis mine):
assuming that the call to malloc succeeds, the object pointed to by p behaves, for most purposes, as if p had been declared as:
struct { int n; double d[m]; } *p;
(there are circumstances in which this equivalence is broken; in particular, the offsets of member d might not be the same).
This is actually a hack, and in fact a GCC extension to C90.
It's also called a struct hack.
So the next time, I would say:
struct bts_action *bts = malloc(sizeof(struct bts_action) + sizeof(char)*100);
For most purposes, it will behave as if the struct had been declared as:
struct bts_action{
u16 type;
u16 size;
u8 data[100];
};
And I can create any number of such struct objects.
The idea is to allow for a variable-sized array at the end of the struct. Presumably, bts_action is some data packet with a fixed-size header (the type and size fields) and a variable-size data member. By declaring it as a 0-length array, it can be indexed just as any other array. You'd then allocate a bts_action struct, of say 1024-byte data size, like so:
size_t size = 1024;
struct bts_action* action = (struct bts_action*)malloc(sizeof(struct bts_action) + size);
See also: http://c2.com/cgi/wiki?StructHack
The code is not valid C (see this). The Linux kernel is, for obvious reasons, not in the slightest concerned with portability, so it uses plenty of non-standard code.
What they are doing is using a non-standard GCC extension: an array of size 0. A standard-compliant program would have written u8 data[]; and it would have meant the very same thing. The authors of the Linux kernel apparently love to make things needlessly complicated and non-standard, if an option to do so reveals itself.
In older C standards, ending a struct with an empty array was known as "the struct hack". Others have already explained its purpose in other answers. The struct hack, in the C90 standard, was undefined behavior and could cause crashes, mainly since a C compiler is free to add any number of padding bytes at the end of the struct. Such padding bytes may collide with the data you tried to "hack" in at the end of the struct.
GCC early on made a non-standard extension to change this from undefined to well-defined behavior. The C99 standard then adopted this concept, and any modern C program can therefore use this feature without risk. It is known as a flexible array member in C99/C11.
Another use of a zero-length array is as a named label inside a struct, to assist with compile-time checks of struct offsets.
Suppose you have some large struct definitions (each spanning multiple cache lines) and you want to make sure they are aligned to a cache line boundary both at the beginning and in the middle, where they cross the boundary.
struct example_large_s
{
u32 first; // align to CL
u32 data;
....
u64 *second; // align to second CL after the first one
....
};
In code you can declare them using GCC extensions like:
__attribute__((aligned(CACHE_LINE_BYTES)))
But you still want to make sure this is enforced at run time.
ASSERT (offsetof (struct example_large_s, first) == 0);
ASSERT (offsetof (struct example_large_s, second) == CACHE_LINE_BYTES);
This would work for a single struct, but it would be hard to cover many structs, each with different names for the members to be aligned. You would most likely end up with code like the following, where you have to look up the member names for each struct:
assert (offsetof (one_struct, <name_of_first_member>) == 0);
assert (offsetof (one_struct, <name_of_second_member>) == CACHE_LINE_BYTES);
assert (offsetof (another_struct, <name_of_first_member>) == 0);
assert (offsetof (another_struct, <name_of_second_member>) == CACHE_LINE_BYTES);
Instead of going this way, you can declare a zero-length array in the struct that acts as a named label with a consistent name but does not consume any space.
#define CACHE_LINE_ALIGN_MARK(mark) u8 mark[0] __attribute__((aligned(CACHE_LINE_BYTES)))
struct example_large_s
{
CACHE_LINE_ALIGN_MARK (cacheline0);
u32 first; // align to CL
u32 data;
....
CACHE_LINE_ALIGN_MARK (cacheline1);
u64 *second; // align to second CL after the first one
....
};
Then the runtime assertion code would be much easier to maintain:
assert (offsetof (one_struct, cacheline0) == 0);
assert (offsetof (one_struct, cacheline1) == CACHE_LINE_BYTES);
assert (offsetof (another_struct, cacheline0) == 0);
assert (offsetof (another_struct, cacheline1) == CACHE_LINE_BYTES);