Why is gcc allowed to speculatively load from a struct?

Example Showing the gcc Optimization and User Code that May Fault

The function 'foo' in the snippet below loads only one of the struct members A or B; at least, that is the intention of the unoptimized code.

    typedef struct {
      int A;
      int B;
    } Pair;

    int foo(const Pair *P, int c) {
      int x;
      if (c)
        x = P->A;
      else
        x = P->B;
      return c/102 + x;
    }

Here is what gcc -O3 gives:

    mov     eax, esi
    mov     edx, -1600085855
    test    esi, esi
    mov     ecx, DWORD PTR [rdi+4]   <-- ***load P->B***
    cmovne  ecx, DWORD PTR [rdi]     <-- ***load P->A***
    imul    edx
    lea     eax, [rdx+rsi]
    sar     esi, 31
    sar     eax, 6
    sub     eax, esi
    add     eax, ecx
    ret
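For readers less used to x86, here is a rough C-level sketch of what that assembly does. The name `foo_branchless` and the `assert` are illustrative assumptions, not part of the original code or of gcc's output. The point is that both members are read unconditionally and `c` only selects which value is kept (on x86, `cmovne` with a memory operand performs the load whether or not the condition holds).

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int A;
    int B;
} Pair;

/* Hypothetical name: a C rendering of the -O3 output, not gcc's own code. */
int foo_branchless(const Pair *P, int c)
{
    assert(P != NULL);   /* illustration only; the original foo has no check */
    int b = P->B;        /* mov    ecx, DWORD PTR [rdi+4] */
    int a = P->A;        /* cmovne ecx, DWORD PTR [rdi]: the load always happens */
    int x = c ? a : b;   /* the select that cmovne implements */
    return c / 102 + x;
}
```

For any pointer to a complete, readable Pair this returns the same value as foo; the two versions can only differ in which memory locations they touch.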

So it appears that gcc is allowed to speculatively load both struct members in order to eliminate branching. But then, is the following code considered undefined behavior or is the gcc optimization above illegal?

    #include <stdlib.h>

    int naughty_caller(int c) {
      Pair *P = (Pair*)malloc(sizeof(Pair)-1); // *** Allocation is enough for A but not for B ***
      if (!P) return -1;

      P->A = 0x42; // *** Initializing allocation only where it is guaranteed to be allocated ***

      int res = foo(P, 1); // *** Passing c=1 to foo should ensure only P->A is accessed? ***

      free(P);
      return res;
    }

If the load speculation happens in the above scenario, loading P->B may cause an exception, because the last byte of P->B may lie in unallocated memory. This exception will not happen if the optimization is turned off.

The Question

Is the load speculation shown above a legal gcc optimization? Where does the spec say or imply that it's OK? If the optimization is legal, how does the code in 'naughty_caller' turn out to be undefined behavior?

asked Oct 02 '17 08:10

zr.

2 Answers

Reading a variable (that was not declared as volatile) is not considered to be a "side effect" as specified by the C standard. So the program is free to read a location and then discard the result, as far as the C standard is concerned.

This is very common. Suppose you request 1 byte of data from a 4 byte integer. The compiler may then read the whole 32 bits if that's faster (aligned read), and then discard everything but the requested byte. Your example is similar to this but the compiler decided to read the whole struct.
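A minimal sketch of that idea, written out by hand: the helper name `byte_of` is hypothetical, and the code only illustrates the kind of transformation a compiler might apply internally, not what any particular compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns byte `index` of `word`, counting from the
   least significant byte (so the result does not depend on endianness). */
uint8_t byte_of(const uint32_t *word, unsigned index)
{
    assert(index < sizeof *word);
    uint32_t whole = *word;                    /* one aligned 32-bit read */
    return (uint8_t)(whole >> (8u * index));   /* keep one byte, discard the rest */
}
```

Only one byte was asked for, yet all four are read. Since the read is not a side effect, nothing observable changes.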

Formally, this follows from the behavior of "the abstract machine", C11 chapter 5.1.2.3. As long as the compiler follows the rules specified there, it is free to do as it pleases, and the only rules listed concern volatile objects and the sequencing of instructions. Reading a different struct member in a volatile struct would not be OK.
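As a sketch of the volatile case, assuming a variant of the question's function (the name `foo_volatile` is made up for illustration): with the struct accessed through a volatile-qualified type, each read counts as an observable access, so the compiler should not introduce a read of the member that the source does not touch.

```c
typedef struct {
    int A;
    int B;
} Pair;

/* Hypothetical variant of foo: the members are read through a
   volatile-qualified lvalue, so only the selected one may be read. */
int foo_volatile(const volatile Pair *P, int c)
{
    int x;
    if (c)
        x = P->A;   /* the only access when c != 0 */
    else
        x = P->B;   /* the only access when c == 0 */
    return c / 102 + x;
}
```

The result is the same as for foo; the difference is in which loads the generated code is permitted to perform, which can be checked by inspecting the compiler output.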

As for allocating too little memory for the whole struct, that's undefined behavior, because the memory layout of the struct is usually not for the programmer to decide; for example, the compiler is allowed to add padding at the end. If there's not enough memory allocated, you might end up accessing forbidden memory even though your code only works with the first member of the struct.
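For contrast, a caller that stays within defined behavior could look like the sketch below. The name `well_behaved_caller` is hypothetical; `foo` and `Pair` are repeated from the question so the block stands on its own.

```c
#include <stdlib.h>

typedef struct {
    int A;
    int B;
} Pair;

int foo(const Pair *P, int c)
{
    int x;
    if (c)
        x = P->A;
    else
        x = P->B;
    return c / 102 + x;
}

/* Hypothetical counterpart to naughty_caller. */
int well_behaved_caller(int c)
{
    Pair *P = malloc(sizeof *P);   /* room for the whole struct, padding included */
    if (!P)
        return -1;

    P->A = 0x42;
    P->B = 0;                      /* c may be 0 here, so B needs a value too */

    int res = foo(P, c);

    free(P);
    return res;
}
```

With the full `sizeof *P` allocated, it no longer matters whether the compiler reads one member or both: every byte it might touch belongs to the allocation.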

answered Sep 23 '22 08:09

Lundin

No, if *P is allocated correctly, P->B will never be in unallocated memory. It might not be initialized, that is all.

The compiler has every right to do what it does. The only thing that is not allowed is to fault on the access of P->B with the excuse that it is not initialized. But what the implementation does, and how, is at its discretion and not your concern.

If you take a pointer to a block returned by malloc that is not guaranteed to be large enough to hold a Pair and cast it to Pair*, the behavior of your program is undefined.

answered Sep 25 '22 08:09

Jens Gustedt


Tags: c, x86, compiler-optimization, gcc, assembly