Recently I stumbled over a comparison between Rust and C and they use the following code:
    bool f(int* a, const int* b) {
        *a = 2;
        int ret = *b;
        *a = 3;
        return ret != 0;
    }
In Rust (same code, but with Rust syntax), it produces the following assembly:
    cmp     dword ptr [rsi], 0
    mov     dword ptr [rdi], 3
    setne   al
    ret
While with gcc it produces the following:
    mov     DWORD PTR [rdi], 2
    mov     eax, DWORD PTR [rsi]
    mov     DWORD PTR [rdi], 3
    test    eax, eax
    setne   al
    ret
The text claims that the C compiler can't optimize the first line away, because a and b could point to the same number. In Rust this is not allowed, so the compiler can optimize it away.
Now to my question:
The function takes a const int*, which is a pointer to a const int. I read this question, and it states that modifying a const int through a pointer should result in a compiler warning and, in the worst case, in UB.
Could this function result in UB if I call it with two pointers to the same integer?
Why can't the C compiler optimize the first line away, under the assumption that two pointers to the same variable would be illegal/UB?
Link to godbolt
Because the data type being pointed to is const, the value being pointed to can't be changed through that pointer. We can also make the pointer itself constant: a const pointer is a pointer whose stored address cannot be changed after initialization.
A compiler may optimize away an object that is itself declared const (as opposed to one merely reached through a pointer to const) by not providing storage for it; instead, its value can be recorded in the symbol table. A subsequent read then only needs a lookup in the symbol table rather than instructions to fetch the value from memory.
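To make the difference between "pointer to const" and "const pointer" concrete, here is a minimal sketch. The function name const_placement_demo and the variables are made up for illustration; the commented-out lines are the ones a conforming compiler should reject.

```c
/* Hypothetical helper showing the two placements of const. */
int const_placement_demo(void) {
    int x = 1;
    int y = 2;

    const int *p = &x;  /* pointer to const int: no writes through p */
    p = &y;             /* OK: p itself can be re-pointed */
    /* *p = 5; */       /* error: *p is read-only when accessed via p */

    int *const q = &x;  /* const pointer to int: q always points at x */
    *q = 10;            /* OK: writes to x through q */
    /* q = &y; */       /* error: q itself cannot be changed */

    x = 20;             /* x is not a const object, so this is fine too */

    return *p + *q;     /* p points at y (2), q points at x (20) */
}
```

Note that neither x nor y is a const object here; const on the pointee type only restricts what can be done through that particular pointer.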
Why can't the C compiler optimize the first line away, under the assumption that two pointers to the same variable would be illegal/UB?
Because you haven't told the C compiler that it is allowed to make that assumption.
C has a type qualifier for exactly this, called restrict, which roughly means: this pointer does not overlap with other pointers (not exactly, but play along).
The assembly output for

    bool f(int* restrict a, const int* b) {
        *a = 2;
        int ret = *b;
        *a = 3;
        return ret != 0;
    }

is

    mov     eax, DWORD PTR [rsi]
    mov     DWORD PTR [rdi], 3
    test    eax, eax
    setne   al
    ret
... which optimizes away the assignment *a = 2.
From https://en.wikipedia.org/wiki/Restrict
In the C programming language, restrict is a keyword that can be used in pointer declarations. By adding this type qualifier, a programmer hints to the compiler that for the lifetime of the pointer, only the pointer itself or a value directly derived from it (such as pointer + 1) will be used to access the object to which it points.
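As a rough sketch of what that promise means for callers: the restrict version is fine as long as the two arguments really refer to different objects. The names f_restrict and call_non_overlapping are made up for this illustration. Passing the same address for both parameters would break the promise and, since the function writes through a, would be undefined behaviour, so the sketch deliberately avoids doing that.

```c
#include <stdbool.h>

/* Same body as the original f, but a is restrict-qualified. */
bool f_restrict(int *restrict a, const int *b) {
    *a = 2;             /* dead store: the compiler may drop it */
    int ret = *b;       /* assumed not to alias *a */
    *a = 3;
    return ret != 0;
}

/* Hypothetical caller that keeps the restrict promise:
   x and y are distinct objects. Returns true iff y_value != 0. */
bool call_non_overlapping(int y_value) {
    int x = 0;
    int y = y_value;
    return f_restrict(&x, &y);
}
```

With distinct objects the result depends only on the value b points to, which is exactly why the compiler is free to drop the first store.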
The function int f(int *a, const int *b); promises not to change the contents of b through that pointer. It makes no promises regarding access to variables through the a pointer.
If a and b point to the same object, changing it through a is legal (provided the underlying object is modifiable, of course).
Example:

    int val = 0;
    f(&val, &val);
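Spelled out a bit more, here is a small sketch of why the first store is observable when the pointers alias. It uses the bool-returning f from the question; the helpers call_aliased and call_distinct are made up for illustration.

```c
#include <stdbool.h>

/* The function from the question, without restrict. */
bool f(int *a, const int *b) {
    *a = 2;
    int ret = *b;       /* if b == a, this reads the 2 stored above */
    *a = 3;
    return ret != 0;
}

/* Hypothetical helper: both parameters point at the same int. */
bool call_aliased(void) {
    int val = 0;
    return f(&val, &val);   /* legal: val is a modifiable object */
}

/* Hypothetical helper: the parameters point at different ints. */
bool call_distinct(void) {
    int x = 0;
    int y = 0;
    return f(&x, &y);       /* *b stays 0 throughout */
}
```

Both objects start at 0, yet call_aliased() returns true and call_distinct() returns false. If the compiler dropped the *a = 2 store, the aliased call would read 0 instead of 2 and the result would change, so the store has to stay.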