
Is it OK to access past the size of a structure via member address, with enough space allocated?

Specifically, is the following code, the line below the marker, OK?

    struct S{
        int a;
    };

    #include <stdlib.h>

    int main(){
        struct S *p;
        p = malloc(sizeof(struct S) + 1000);
        // This line:
        *(&(p->a) + 1) = 0;
    }

People have argued here, but no one has given a convincing explanation or reference.

Their argument is about slightly different code, but the question is essentially the same:

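    #include <stddef.h>  /* offsetof */
    #include <stdint.h>  /* int64_t */
    #include <stdlib.h>  /* malloc */
    #include <string.h>  /* strlen, strcpy */

    typedef struct _pack{
        int64_t c;
    } pack;

    int main(){
        pack *p;
        char str[9] = "aaaaaaaa"; // Input
        size_t len = offsetof(pack, c) + (strlen(str) + 1);
        p = malloc(len);
        // This line, with similar intention:
        strcpy((char*)&(p->c), str);
    //                ^^^^^^^
    }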
Asked by iBug, Nov 10 '17 at 13:11



1 Answer

The intent at least since the standardization of C in 1989 has been that implementations are allowed to check array bounds for array accesses.

The member p->a is an object of type int. C11 6.5.6p7 says that

7 For the purposes of [additive operators] a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

Thus

&(p->a) 

is a pointer to an int; but it also behaves as if it were a pointer to the first element of an array of length 1, with int as the element type.

Now 6.5.6p8 allows one to calculate &(p->a) + 1, which is a pointer to just past the end of that array, so the calculation itself is not undefined behaviour. However, dereferencing such a pointer is invalid. Annex J.2 spells it out; the behaviour is undefined when:

Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated (6.5.6).

In the expression above, there is only one array, the one (as if) with exactly 1 element. If &(p->a) + 1 is dereferenced, the array with length 1 is accessed out of bounds and undefined behaviour occurs, i.e.

behavior [...], for which [The C11] Standard imposes no requirements

With the note saying that:

Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

That the most common behaviour is ignoring the situation completely, i.e. behaving as if the pointer referenced the memory location just after, doesn't mean that other kinds of behaviour wouldn't be acceptable from the standard's point of view. The standard allows every imaginable and unimaginable outcome.
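To make the distinction in 6.5.6p8 concrete, here is a minimal sketch of what is defined for the one-past pointer and what is not. The helper name `one_past_distance` is made up for illustration; only the pointer arithmetic comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

struct S {
    int a;
};

/* Hypothetical helper: forms the one-past pointer for p->a and returns
 * its distance from &p->a. Forming, comparing and subtracting the
 * pointer is defined; dereferencing it is not. */
ptrdiff_t one_past_distance(struct S *p)
{
    assert(p != NULL);

    int *first = &p->a;    /* behaves like &array[0] of an int[1] */
    int *end = first + 1;  /* defined: points just past that array */

    /* Undefined, no matter how much memory was allocated behind p:
     *     *end = 0;
     */

    return end - first;    /* defined: both pointers belong to the same array */
}
```

Calling `one_past_distance` on any valid `struct S` object should give 1, whether the object is automatic or lives in a larger `malloc` block.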


There have been claims that the C11 standard text is written vaguely, that the committee's intention was for this to be allowed, and that it would have been fine under earlier standards. That is not true. Read this part of the committee response to [Defect Report #017 dated 10 Dec 1992 to C89].

Question 16

[...]

Response

For an array of arrays, the permitted pointer arithmetic in subclause 6.3.6, page 47, lines 12-40 is to be understood by interpreting the use of the word object as denoting the specific object determined directly by the pointer's type and value, not other objects related to that one by contiguity. Therefore, if an expression exceeds these permissions, the behavior is undefined. For example, the following code has undefined behavior:

    int a[4][5];

    a[1][7] = 0; /* undefined */

Some conforming implementations may choose to diagnose an array bounds violation, while others may choose to interpret such attempted accesses successfully with the obvious extended semantics.

(bolded emphasis mine)

There is no reason why the same reasoning wouldn't apply to scalar members of structures, especially when 6.5.6p7 says that a pointer to them should be considered to behave the same as a pointer to the first element of an array of length one with the type of the object as its element type.
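As an illustration of the committee's example, here is a sketch of one defined way to reach the element that a[1][7] is presumably "meant" to hit. The names `ROWS`, `COLS` and `set_flat` are assumptions made for this example and do not come from the defect report.

```c
#include <assert.h>

enum { ROWS = 4, COLS = 5 };

/* Hypothetical helper: stores value in the element that a[row][col]
 * would reach under "flat" addressing, but normalizes the subscripts
 * first so that each one stays within its own array bounds. */
void set_flat(int a[ROWS][COLS], int row, int col, int value)
{
    int flat = row * COLS + col;

    /* Refuse positions outside the whole 4x5 object. */
    assert(flat >= 0 && flat < ROWS * COLS);

    a[flat / COLS][flat % COLS] = value;
}
```

With this helper, `set_flat(a, 1, 7, 0)` writes to `a[2][2]`, the same memory position, without ever indexing a row past its five elements.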

If you want to address consecutive structs, you can always take the pointer to the first member, cast it to a pointer to the struct, and advance that instead:

    *(int *)((struct S *)&(p->a) + 1) = 0;
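If the allocation is really meant to hold several struct S objects, a plainer way to express the same access is to allocate an array of them and index the struct pointer. The following is only a sketch under that assumption; `make_structs` is a hypothetical helper, not something from the question.

```c
#include <assert.h>
#include <stdlib.h>

struct S {
    int a;
};

/* Hypothetical helper: allocates count consecutive struct S objects
 * and sets .a to 0 in each. Returns NULL if the allocation fails. */
struct S *make_structs(size_t count)
{
    assert(count > 0);

    struct S *p = malloc(count * sizeof *p);
    if (p == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++)
        p[i].a = 0;

    return p;
}
```

With `p = make_structs(3)`, `p[1].a` names the same int as the cast expression above, because a pointer to the first member, suitably converted, points to the struct itself. The caller is responsible for `free(p)`.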