
Functions only getting inlined if defined in a header. Am I missing something?

Using gcc v4.8.1

If I do:

//func.hpp

#ifndef FUNC_HPP
#define FUNC_HPP

int func(int);

#endif

//func.cpp

#include "func.hpp"

int func(int x){
    return 5*x+7;
}

//main.cpp

#include <iostream>

#include "func.hpp"

using std::cout;
using std::endl;

int main(){
    cout<<func(5)<<endl;
    return 0;
}

Even the simple function func will not get inlined. No combination of inline, extern, static, and __attribute__((always_inline)) on the prototype and/or the definition changes this (obviously some combinations of these specifiers fail to compile or produce warnings; I'm not talking about those). I'm using g++ *.cpp -O3 -o run to build and g++ *.cpp -O3 -S for assembly output. When I look at the assembly output, I still see call func.

It appears the only way I can get the function properly inlined is to put the definition (and the prototype, though that is probably not necessary) in the header file. If the header is included by only one file in the whole program (only main.cpp, for example), it compiles and the function is inlined without even needing the inline specifier. If the header is included by multiple files, the inline specifier appears to be needed to resolve multiple-definition errors, and that appears to be its only purpose. The function is of course still inlined properly in that case.
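For reference, the header-only arrangement described above might look like the following sketch. It reuses the names from the question (func, FUNC_HPP); the comments are illustrative and not taken from the original code.

```cpp
// func.hpp (sketch): the definition lives in the header, so every
// translation unit that includes it sees the full body of func.
#ifndef FUNC_HPP
#define FUNC_HPP

// `inline` allows this same definition to appear in several translation
// units without causing a multiple-definition error at link time.
inline int func(int x) {
    return 5 * x + 7;
}

#endif
```

Any file that includes this header, such as main.cpp from the question, can then call func(5) with the body visible to the optimizer.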

So my question is: am I doing something wrong? Am I missing something? Whatever happened to:

"The compiler is smarter than you. It knows when a function should be inlined better than you do. And never ever use C arrays. Always use std::vector!"

-Every other StackOverflow user

Really? So calling func(5) and printing the result is faster than just printing 32? I will blindly follow you off the edge of a cliff, almighty, all-knowing and all-wise gcc.

For the record, the above code is just an example. I am writing a ray tracer and when I moved all of the code of my math and other utility classes to their header files and used the inline specifier, I saw massive performance gains. Literally like 10 times faster for some scenes.

Mike asked Oct 22 '13 12:10



3 Answers

Recent GCC is able to inline across compilation units through link-time optimizations (LTO). You need to compile - and link - with -flto; see Link-time optimization and inline and GCC optimize options.

(Actually, LTO is done at link time by lto1, a special variant of the compiler. LTO works by serializing some of GCC's internal representations inside the object files, where lto1 can read them back. So when src1.c is compiled with -flto, the generated src1.o contains the GIMPLE representation in addition to the object code; and when linking with gcc -flto src*.o, the lto1 "front-end" extracts the GIMPLE representations from the src*.o files and recompiles almost everything again...)

You need to explicitly pass -flto both at compile time AND at link time (see this). If using a Makefile you could try make CC='gcc -flto'; otherwise, compile each translation unit with e.g. gcc -Wall -flto -O2 -c src1.c (and likewise for src2.c etc.) and link your whole program (or library) with gcc -Wall -flto -O2 src1.o src2.o -o prog -lsomelib.

Notice that -flto will significantly slow down your build (it is not enabled by -O3, so you need to pass it explicitly, and you need to link with it too). Often you get a 5% or 10% performance improvement in the built program at the expense of nearly doubling the build time. Sometimes you can get more.
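Applied to the question's code, the -flto build could be sketched as follows. This is a single-file rendering for illustration only: in the real project the three parts stay in func.hpp, main.cpp and func.cpp, and call_func is a hypothetical stand-in for the call inside main(). The build commands in the comment mirror the ones above, using g++ and -O3 as the asker does.

```cpp
// Build sketch (assumes the question's file names):
//   g++ -O3 -flto -c func.cpp
//   g++ -O3 -flto -c main.cpp
//   g++ -O3 -flto func.o main.o -o run

// --- func.hpp: declaration only ---
int func(int);

// --- main.cpp side: the caller sees just the declaration ---
int call_func() {   // hypothetical stand-in for the call made in main()
    return func(5);
}

// --- func.cpp: out-of-line definition, in its own translation unit ---
int func(int x) {
    return 5 * x + 7;
}
```

With -flto on both the compile and link steps, the link-time pass has the body of func available when it processes the caller, which is what makes cross-file inlining possible. Whether the call actually disappears from the assembly still depends on the GCC version and its inlining heuristics.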

Basile Starynkevitch answered Sep 18 '22 11:09


The compiler can't inline what it doesn't have. It needs the full body of the function to inline its code.

You have to remember that the compiler only works on one source file at a time (more precisely, one translation unit at a time), and has no idea about other source files and what's in them.

The linker might be able to do it though, as it sees all the code, and some linkers have flags that allow some link-time optimizations.
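One rough way to see this difference inside a single file is to compare a function whose body the optimizer may inline with one that is deliberately kept out of its callers. The sketch below assumes GCC or Clang, since __attribute__((noinline)) is a compiler extension; the names func_visible, func_opaque, call_visible and call_opaque are made up for illustration, and noinline only imitates the situation of a body living in another translation unit.

```cpp
// Assumes GCC or Clang: __attribute__((noinline)) is a compiler extension.

// Body is visible and inlinable: the optimizer is free to fold the call.
int func_visible(int x) {
    return 5 * x + 7;
}

// Inlining is explicitly forbidden, imitating a function whose definition
// sits in another translation unit that the compiler cannot look into.
__attribute__((noinline)) int func_opaque(int x) {
    return 5 * x + 7;
}

int call_visible() { return func_visible(5); }
int call_opaque()  { return func_opaque(5); }
```

Compiling this with g++ -O3 -S and comparing the two callers in the assembly output is one way to observe the effect: call_visible can typically be reduced to a constant, while call_opaque usually keeps a call or jump to func_opaque. The exact output varies with compiler version and flags.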

Some programmer dude answered Sep 22 '22 11:09


As far as inlining goes, the inline keyword is nothing more than a suggestion to the compiler: "I want this function to be inlined". The compiler can ignore it without even a warning.

In order for your function func(...) to be inlined, your compiler/linker HAVE TO support some form of link-time code generation (and optimization). Because func() and main() lie in different translation units, the C++ compiler can't see them both at the same time, and therefore can't inline one function into the other. It NEEDS LINKER SUPPORT to do so.

Consult your build tool manuals on how to switch link time code gen features on, if they are supported at all.

Pavel Beliy answered Sep 19 '22 11:09