Can someone please clarify whether (and why) a function can be given the pure or const attribute if it has a pointer parameter?
According to the GCC documentation:
Some common examples of pure functions are strlen or memcmp.
The whole point of a pure function is that it need only be called once for the same parameters, i.e. the result can be cached if the compiler sees fit. But how does this work for memcmp?
For example:
char *x = calloc(1, 8);
char *y = calloc(1, 8);
if (memcmp(x, y, 8) > 0)
    printf("x > y\n");
x[1] = 'a';
if (memcmp(x, y, 8) > 0)
    printf("x > y\n");
The parameters to the second call to memcmp are identical to the first (the pointers point to the same addresses). How does the compiler know not to reuse the result from the first call if memcmp is pure?
In my case I want to pass an array to a pure function and calculate the result based on the array alone. Can someone reassure me that this is okay, and that when values in the array change but the address does not, my function will still be called correctly?
If I understood the documentation correctly, a pure function can depend on the contents of memory, as long as the compiler can tell when that memory may have changed. Moreover, a pure function cannot change the state of the program, such as a global variable; it only produces a return value.
In your example code, memcmp can be a pure function. The compiler sees that the memory is changed between the calls to memcmp, and cannot reuse the result of the first call for the second call.
On the other hand, memcmp cannot be declared as a const function, since it depends on data in memory. If it were const, the compiler could optimize more aggressively, for example by reusing the first result even after the memory has changed, which would be wrong here.
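To make the distinction concrete, here is a minimal sketch. The function names square and first_byte are hypothetical and chosen only for illustration; the attribute syntax is the GCC one used elsewhere in this thread.

```c
/* Hypothetical example: the result depends only on the argument value
   and no memory is read, so const is appropriate. */
__attribute__((const))
int square(int value)
{
    return value * value;
}

/* Hypothetical example: reads the data behind its pointer parameter,
   so it may be declared pure but must not be declared const. */
__attribute__((pure))
int first_byte(const char *buffer)
{
    return buffer[0];
}
```

Marking first_byte as const would tell the compiler that the buffer contents are irrelevant to the result, which is not true.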
For this reason, it seems safe to declare the function that you want to implement as pure (but not const).
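As a rough sketch of the situation described in the question, assuming the function only reads the array: sum_array and sum_before_and_after are hypothetical names, not anything from the original post.

```c
#include <stddef.h>

/* Hypothetical helper: the result depends only on the array contents
   and the count, and nothing is written. */
__attribute__((pure))
int sum_array(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Calls sum_array before and after a store to the same array. */
void sum_before_and_after(int results[2])
{
    int data[4] = {1, 2, 3, 4};

    results[0] = sum_array(data, 4);   /* 1 + 2 + 3 + 4 */
    data[1] = 20;                      /* same address, new contents */
    results[1] = sum_array(data, 4);   /* 1 + 20 + 3 + 4 */
}
```

Because the store to data[1] is visible to the compiler, the second call has to be evaluated again, so results holds 10 and then 28.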
With respect to pure, the article Implications of pure and constant functions explains that a pure function has no side effects and that its result depends only on its parameters and on the contents of memory.
So if the compiler can determine that the arguments are the same and that memory has not changed between calls, it can eliminate the later calls to the pure function, since it knows the function has no side effects.
This means the compiler has to analyze whether the data behind the arguments could have been modified before it can decide to eliminate a repeated call to a pure function with the same arguments.
An example from the article is as follows:
int someimpurefunction(int a);
int somepurefunction(int a) __attribute__((pure));
int testfunction(int a, int b, int c, int d) {
    int res1 = someimpurefunction(a) ? someimpurefunction(a) : b;
    int res2 = somepurefunction(a) ? somepurefunction(a) : c;
    int res3 = a+b ? a+b : d;
    return res1+res2+res3;
}
The article then shows the optimized assembly, in which somepurefunction is called only once, and says:
As you can see, the pure function is called just once, because the two references inside the ternary operator are equivalent, while the other one is called twice. This is because there was no change to global memory known to the compiler between the two calls of the pure function (the function itself couldn't change it – note that the compiler will never take multi-threading into account, even when asking for it explicitly through the -pthread flag), while the non-pure function is allowed to change global memory or use I/O operations.
This logic also applies to pointers: if the compiler can prove that the memory a pointer refers to has not been modified, it can eliminate the repeated call to the pure function. In your case, when the compiler sees:
x[1] = 'a';
it cannot eliminate the second call to memcmp, because the memory pointed to by x has changed.
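For reference, the question's snippet can be turned into a self-contained function along these lines. The wrapper name compare_twice and its out-parameters are assumptions made for illustration; only the calloc, memcmp and store sequence comes from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper around the question's snippet. Returns 0 on
   success and -1 if an allocation fails. */
int compare_twice(int *first, int *second)
{
    char *x = calloc(1, 8);
    char *y = calloc(1, 8);
    if (x == NULL || y == NULL) {
        free(x);
        free(y);
        return -1;
    }

    *first = memcmp(x, y, 8);   /* both buffers are all zero bytes */
    x[1] = 'a';                 /* store between the two calls */
    *second = memcmp(x, y, 8);  /* has to be evaluated again */

    free(x);
    free(y);
    return 0;
}
```

After a successful call, *first is 0 and *second is positive, so only the second comparison in the question would print x > y.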