
Why can't the C compiler optimize changing the value of a const pointer assuming that two pointers to the same variable would be illegal/UB?

Recently I stumbled over a comparison between Rust and C and they use the following code:

    bool f(int* a, const int* b) {
        *a = 2;
        int ret = *b;
        *a = 3;
        return ret != 0;
    }

In Rust (same code, but with Rust syntax), it produces the following assembly:

    cmp      dword ptr [rsi], 0
    mov      dword ptr [rdi], 3
    setne    al
    ret

While with gcc it produces the following:

    mov      DWORD PTR [rdi], 2
    mov      eax, DWORD PTR [rsi]
    mov      DWORD PTR [rdi], 3
    test     eax, eax
    setne    al
    ret

The text claims that the C compiler can't optimize the first line away, because a and b could point to the same number. In Rust this is not allowed, so the compiler can optimize it away.

Now to my question:

The function takes a const int*, which is a pointer to a const int. I read this question and it states that modifying a const int through a pointer should result in a compiler warning and, in the worst case, in UB.

Could this function result in UB if I call it with two pointers to the same integer?

Why can't the C compiler optimize the first line away, under the assumption that two pointers to the same variable would be illegal/UB?

Link to godbolt

izlin asked Feb 01 '21 13:02




2 Answers

Why can't the C Compiler optimize the first line away, under the assumption, that two pointers to the same variable would be illegal/UB?

Because you haven't told the C compiler that it is allowed to make that assumption.

C has a type qualifier for exactly this called restrict which roughly means: this pointer does not overlap with other pointers (not exactly, but play along).

The assembly output for

    bool f(int* restrict a, const int* b) {
        *a = 2;
        int ret = *b;
        *a = 3;
        return ret != 0;
    }

is

    mov     eax, DWORD PTR [rsi]
    mov     DWORD PTR [rdi], 3
    test    eax, eax
    setne   al
    ret

... in which the assignment *a = 2 has been optimized away.

From https://en.wikipedia.org/wiki/Restrict

In the C programming language, restrict is a keyword that can be used in pointer declarations. By adding this type qualifier, a programmer hints to the compiler that for the lifetime of the pointer, only the pointer itself or a value directly derived from it (such as pointer + 1) will be used to access the object to which it points.
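To make the caller's side of that promise concrete, here is a minimal sketch. The names f_restrict and call_ok are made up for illustration and are not from the question. The compiler may drop the first store only because the caller guarantees that nothing else accesses *a during the call; a call that passes the same address twice would break that guarantee.

```c
#include <assert.h>
#include <stdbool.h>

/* restrict on a: the caller promises that, during the call, the object *a
   is accessed only through a (or pointers derived from it). */
bool f_restrict(int *restrict a, const int *b) {
    *a = 2;          /* dead store under the promise; may be dropped */
    int ret = *b;    /* b is assumed not to point at *a */
    *a = 3;
    return ret != 0;
}

/* Hypothetical helper: a call that honors the promise (two distinct objects). */
bool call_ok(void) {
    int x = 0, y = 7;
    bool r = f_restrict(&x, &y);
    assert(x == 3);  /* only the final store is observable */
    return r;
}

/* f_restrict(&x, &x) would break the promise: *a is modified through a and
   also read through b, which is undefined behavior. */
```

The compiler does not check the promise for you. With restrict, the responsibility for non-overlapping arguments moves entirely to the caller.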

Morten Jensen answered Sep 18 '22 16:09



The function bool f(int *a, const int *b); promises not to change the contents of b through that pointer. It makes no promises regarding access to variables through the a pointer.

If a and b point to the same object, changing it through a is legal (provided the underlying object is modifiable, of course).

Example:

    int val = 0;
    f(&val, &val);
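Spelled out as a compilable sketch (the helper names call_aliased and call_distinct are hypothetical, added only for illustration), the aliased call is well defined and its result depends on the first store. That is why the compiler has to keep *a = 2 in the version without restrict.

```c
#include <assert.h>
#include <stdbool.h>

/* Same function as in the question. */
bool f(int *a, const int *b) {
    *a = 2;
    int ret = *b;    /* reads 2 when b points at the same object as a */
    *a = 3;
    return ret != 0;
}

/* Hypothetical helper: both pointers refer to the same modifiable int. */
bool call_aliased(void) {
    int val = 0;
    bool r = f(&val, &val);
    assert(val == 3);
    return r;        /* true: *b saw the value 2 written through a */
}

/* Hypothetical helper: two distinct objects, so *b stays 0. */
bool call_distinct(void) {
    int x = 0, y = 0;
    return f(&x, &y);   /* false */
}
```

If the compiler dropped the first store, call_aliased would read the original 0 through b and return false, so the optimization would change the behavior of a valid program.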
pmg answered Sep 20 '22 16:09
