
Is an inline function in different translation units with different compiler flags undefined behaviour?

In Visual Studio you can set different compiler options for individual cpp files. For example, under "Code Generation" we can enable basic runtime checks in debug mode, or we can change the floating-point model (precise/strict/fast). These are just examples; there are plenty of other flags.

An inline function can be defined multiple times in the program, as long as the definitions are identical. We put this function into a header and include it in several translation units. Now, what happens if different compiler options in different cpp files lead to slightly different compiled code for the function? Do the definitions then differ, and do we have undefined behaviour? You could make the function static (or put it into an unnamed namespace), but going further, every member function defined directly in a class is implicitly inline. This would mean that we may only include classes in different cpp files if these cpp files share identical compiler flags. I cannot imagine this to be true, because it would be far too easy to get wrong.

Are we really in the land of undefined behaviour that quickly? Or will compilers handle these cases?
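For reference, the static / unnamed-namespace workaround mentioned above could look roughly like this. It is only a sketch: the function names `twice` and `thrice` are made up for illustration, and the code stands in for the contents of a header that several cpp files include.

```cpp
#include <cassert>  // only needed for the usage lines shown below

// Hypothetical header content. Both functions have internal linkage, so every
// translation unit that includes the header gets its own private copy. The
// linker never merges copies that were compiled with different flags.
namespace {
    int twice(int x) { return x + x; }
}

static int thrice(int x) { return x + x + x; }
```

A caller would simply use `twice(21)` or `thrice(2)`. The price is one copy of the code per translation unit, and it does not help with member functions defined inside a class, which is the case the question is really worried about.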

phön asked Aug 28 '18 05:08



2 Answers

As far as the Standard is concerned, each combination of command-line flags turns a compiler into a different implementation. While it is useful for implementations to be able to use object files produced by other implementations, the Standard imposes no requirement that they do so.

Even in the absence of in-lining, consider having the following function in one compilation unit:

char foo(void) { return 255; }

and the following in another:

char foo(void);
int arr[128];
void bar(void)
{
  int x=foo();
  if (x >= 0 && x < 128)
     arr[x]=1;
}

If char were a signed type in both compilation units, the value of x in the second unit would be less than zero (thus skipping the array assignment). If it were an unsigned type in both units, it would be greater than 127 (likewise skipping the assignment). If one compilation unit used a signed char and the other an unsigned char, however, and if the implementation expected return values to be sign-extended or zero-extended in the result register, a compiler might conclude that x can't be greater than 127 even though it holds 255, or that it can't be less than 0 even though it holds -1. Consequently, the generated code might access arr[255] or arr[-1], with potentially disastrous results.
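The two consistent cases can be checked in a single program by spelling out the signedness explicitly. This is a sketch only: `foo_signed`, `foo_unsigned` and `would_write` are hypothetical stand-ins for the two views of plain char (which GCC, for instance, selects with -fsigned-char and -funsigned-char), and it assumes the usual two's complement conversion of 255 to signed char. The dangerous mixed case cannot be reproduced this way, because it needs two separately compiled units.

```cpp
#include <cassert>  // only needed for the usage lines shown below

// Stand-ins for what foo() returns under each interpretation of plain char.
signed char foo_signed() { return static_cast<signed char>(255); }  // -1 on two's complement targets
unsigned char foo_unsigned() { return 255; }

// The range check from bar(): true only if arr[x] would be written.
bool would_write(int x) { return x >= 0 && x < 128; }
```

With matching signedness on both sides the check rejects the value either way: `would_write(foo_signed())` and `would_write(foo_unsigned())` are both false, so arr is never touched.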

While there are many cases where it should be safe to combine code using different compiler flags, the Standard makes no effort to distinguish those where such mixing is safe from those where it is unsafe.

supercat answered Oct 21 '22 05:10



I recently wrote some code for GCC to test whether this problem actually exists.

SPOILER: it does.

Setup:

I'm compiling some of our code with AVX512 instructions. Since most CPUs don't support AVX512, we need to compile most of our code without AVX512. The question is whether an inline function used in a cpp file compiled with AVX512 can "poison" the whole library with illegal instructions.

Imagine a case where a function from a non-AVX512 cpp file calls our function, but the call ends up in machine code generated by the AVX512-compiled unit. This would give us an illegal instruction on non-AVX512 machines.

Let's give it a try:

func.h

inline void __attribute__ ((noinline)) double_it(float* f) {
  for (int i = 0; i < 16; i++)
    f[i] = f[i] + f[i];
}

We define an inline (in the linker sense) function. The hard-coded 16 makes the GCC optimizer use AVX512 instructions, since 16 floats fill exactly one 512-bit register. We have to mark it ((noinline)) to prevent the compiler from inlining it (i.e. pasting its code into callers). This is a cheap way to pretend the function is too long to be worth inlining.

avx512.cpp

#include "func.h"
#include <iostream>

void run_avx512() {
  volatile float f = 1;
  float arr [16] = {f};
  double_it(arr);
  for (int i = 0; i < 16; i++)
    std::cout << arr[i] << " ";
  std::cout << std::endl;
}

This is the AVX512 use of our double_it function. It doubles an array and prints the result. We will compile it with AVX512.

non512.cpp

#include "func.h"
#include <iostream>

void run_non_avx() {
  volatile float f = 1;
  float arr [16] = {f};
  double_it(arr);
  for (int i = 0; i < 16; i++)
    std::cout << arr[i] << " ";
  std::cout << std::endl;
}

Same logic as before. This one won't be compiled with AVX512.

lib_user.cpp

void run_non_avx();

int main() {
  run_non_avx();
}

Some user code. It calls run_non_avx, which was compiled without AVX512. It doesn't know it's gonna blow up :)

Now we can compile these files and link them as a shared library (a regular static library would probably work as well):

g++ -c avx512.cpp -o avx512.o -O3 -mavx512f -g3 -fPIC
g++ -c non512.cpp -o non512.o -O3 -g3 -fPIC
g++ -shared avx512.o non512.o -o libbad.so
g++ lib_user.cpp -L . -lbad -o lib_user.x
./lib_user.x

Running this on my machine (no AVX512) gives me

$ ./lib_user.x
Illegal instruction (core dumped)

On a side note, if I change the order of avx512.o and non512.o, it starts working. It seems the linker keeps the first definition it sees and ignores subsequent definitions of the same function.
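One way to avoid depending on link order might be to give the function a different symbol name per instruction set, so the linker never has two candidates to choose from. The sketch below is not part of my test: the names `mylib`, `MYLIB_ISA_NS`, `isa_avx512` and `isa_generic` are made up, and the extra length parameter is my own addition. It relies on `__AVX512F__`, which GCC and Clang predefine when AVX512F code generation is enabled (for example via -mavx512f).

```cpp
#include <cassert>  // only needed for the usage lines shown below
#include <cstddef>

// Pick a namespace name based on the flags this translation unit is built with.
#if defined(__AVX512F__)
#define MYLIB_ISA_NS isa_avx512
#else
#define MYLIB_ISA_NS isa_generic
#endif

namespace mylib {
inline namespace MYLIB_ISA_NS {

// The mangled name now contains the ISA namespace, so objects built with and
// without AVX512 emit differently named symbols that the linker keeps apart.
inline void double_it(float* f, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        f[i] = f[i] + f[i];
}

}  // inline namespace MYLIB_ISA_NS
}  // namespace mylib
```

Callers still write `mylib::double_it(arr, 16)`; each translation unit binds to the variant that matches its own flags. This only covers differences you explicitly test for with a macro, so it is a mitigation for known cases like this one rather than a general fix.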

S. Kaczor answered Oct 21 '22 06:10
