
Effects of declaring a function as pure or const to GCC, when it isn't

GCC can suggest functions for attribute pure and attribute const with the flags -Wsuggest-attribute=pure and -Wsuggest-attribute=const.

The GCC documentation says:

Many functions have no effects except the return value and their return value depends only on the parameters and/or global variables. Such a function can be subject to common subexpression elimination and loop optimization just as an arithmetic operator would be. These functions should be declared with the attribute pure.

But what can happen if you attach __attribute__((__pure__)) to a function that doesn't match the above description, and does have side effects? Is it simply the possibility that the function will be called fewer times than you would want it to be, or is it possible to create undefined behaviour or other kinds of serious problems?

Similarly for __attribute__((__const__)) which is stricter again - the documentation states:

Basically this is just slightly more strict class than the pure attribute below, since function is not allowed to read global memory.

But what can actually happen if you attach __attribute__((__const__)) to a function that does access global memory?

I would prefer technical answers with explanations of actual possible scenarios within the scope of GCC / G++, rather than the usual "nasal demons" handwaving that appears whenever undefined behaviour gets mentioned.

Riot asked Feb 02 '17 03:02



1 Answer

But what can happen if you attach __attribute__((__pure__)) to a function that doesn't match the above description, and does have side effects?

Yes, the function may simply be called fewer times than you expect. Here's a short example:

extern __attribute__((pure)) int mypure(const char *p);

int call_pure() {
  int x = mypure("Hello");
  int y = mypure("Hello");
  return x + y;
}

My version of GCC (4.8.4) is clever enough to remove the second call to mypure (the result is 2*mypure("Hello")). Now imagine mypure were printf: the second "Hello" would never be printed, so that side effect would be lost.
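The lost side effect can be made observable with a small self-contained sketch. The names noisy_length, call_count, call_twice and report are hypothetical and not part of the original answer; the function is deliberately mis-annotated, and how many calls survive depends on the compiler version and optimization level, so no particular count is guaranteed.

```c
#include <stdio.h>

/* Hypothetical counter used only to make the side effect observable. */
static int call_count = 0;

/* Deliberately wrong: the function modifies call_count, so it is not pure.
   noinline keeps the calls visible as calls instead of inlined bodies. */
__attribute__((pure, noinline))
static int noisy_length(const char *p) {
  int n = 0;
  ++call_count;              /* side effect the attribute promises is absent */
  while (p[n] != '\0')
    ++n;
  return n;
}

int call_twice(void) {
  int x = noisy_length("Hello");
  int y = noisy_length("Hello");
  return x + y;              /* the sum is 10 however many calls survive */
}

void report(void) {
  int sum = call_twice();
  /* The printed count is not reliable: the compiler may merge the calls,
     and may also assume call_count was not changed by a "pure" call. */
  printf("sum=%d calls=%d\n", sum, call_count);
}
```

Compiling this once without optimization and once with -O2, then calling report() from main, is one way to compare the counts; the sum stays the same while the reported number of calls may differ.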

Note that if I replace call_pure with

extern char s[];  // defined elsewhere

int call_pure() {
  int x = mypure("Hello");
  s[0] = 1;
  int y = mypure("Hello");
  return x + y;
}

both calls will be emitted, because the assignment to s[0] may change the value that mypure returns (a pure function is allowed to read global memory).

Is it simply the possibility that the function will be called fewer times than you would want it to be, or is it possible to create undefined behaviour or other kinds of serious problems?

Well, it can cause UB indirectly. For example:

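extern __attribute__((pure)) int get_index();

extern char a[];  // defined elsewhere
int i;
void foo() {
  int first = get_index();  // Actually returns -1
  a[get_index()] = 0;       // Would actually return 0
  i = first;                // Stored after both calls, so no global write separates them
}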

The compiler will most likely drop the second call to get_index() and reuse the first returned value, -1, which results in an out-of-bounds access (a buffer overflow, or technically an underflow).

But what can actually happen if you attach __attribute__((__const__)) to a function that does access global memory?

Let's again take the above example with

int call_pure() {
  int x = mypure("Hello");
  s[0] = 1;
  int y = mypure("Hello");
  return x + y;
}

If mypure were annotated with __attribute__((const)), the compiler would again drop the second call and optimize the return to 2*mypure("Hello"), because a const function promises not to read global memory and so the store to s[0] cannot affect it. If mypure actually reads s, the wrong result will be produced.
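For contrast, here is a minimal sketch of annotations that do match the documented contracts. The names table, square, table_sum and sum_around_store are hypothetical; the point is only which attribute fits which kind of function.

```c
/* Hypothetical lookup table standing in for "global memory". */
static int table[4] = {1, 2, 3, 4};

/* const: the result depends only on the argument value. */
__attribute__((const))
int square(int v) {
  return v * v;
}

/* pure, not const: the result also depends on the contents of table. */
__attribute__((pure))
int table_sum(void) {
  int sum = 0;
  for (int i = 0; i < 4; ++i)
    sum += table[i];
  return sum;
}

int sum_around_store(void) {
  int before = table_sum();  /* reads the initial contents */
  table[0] = 5;              /* a store the second call must observe */
  int after = table_sum();   /* must be evaluated again */
  return before + after;
}
```

Because table_sum is only declared pure, the compiler has to assume the store to table[0] can change its result and keep both calls. Declaring it const would be exactly the mistake described above: the two calls could be merged and the updated table contents ignored.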

EDIT

I know you asked to avoid hand-waving, but here's a generic explanation. By default, a function call blocks a lot of optimizations inside the compiler, because the call has to be treated as a black box that may have arbitrary side effects (modifying any global variable, etc.). Annotating a function with const or pure instead allows the compiler to treat it more like an expression, which allows for more aggressive optimization.

The examples are really too numerous to list. The one I gave above is common subexpression elimination, but we could just as easily demonstrate the benefits for loop-invariant code motion, dead code elimination, alias analysis, etc.
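As one illustration of the loop-invariant case, here is a hedged sketch with hypothetical names (text_length, count_char). The function really is pure here, so the compiler is allowed, though not required, to evaluate it once before the loop instead of on every iteration.

```c
#include <stddef.h>

/* Correctly pure: reads memory through s but modifies nothing. */
__attribute__((pure))
size_t text_length(const char *s) {
  size_t n = 0;
  while (s[n] != '\0')
    ++n;
  return n;
}

size_t count_char(const char *s, char c) {
  size_t hits = 0;
  /* The loop body writes no memory that text_length could read, so the
     call in the condition is a candidate for hoisting out of the loop. */
  for (size_t i = 0; i < text_length(s); ++i)
    if (s[i] == c)
      ++hits;
  return hits;
}
```

If the loop body stored through a pointer that might alias s, the compiler would have to re-evaluate text_length on each iteration, which is the same reasoning as in the s[0] = 1 example above.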

yugr answered Sep 21 '22 17:09