Profiling C++ in the presence of aggressive inlining?

Tags: c++, profiling

I am trying to figure out where my C++ program is spending its time, using gprof. Here's my dilemma: if I compile with the same optimization settings I use for my release build, pretty much everything gets inlined, and gprof tells me, unhelpfully, that 90% of my time is spent in a core routine, where everything was inlined. On the other hand, if I compile with inlining disabled, the program runs an order of magnitude slower.

I want to find out how much time procedures called from my core routine are taking, when my program is compiled with inlining enabled.

I am running 64-bit Ubuntu 9.04 on a quad-core Intel machine. I looked into google-perftools, but that doesn't seem to work well on x86_64. Running on a 32-bit machine is not an option.

Does anyone have suggestions as to how I can more effectively profile my application, when inlining is enabled?

Edit: Here is some clarification of my problem. I apologize if it was not clear initially.

I want to find where the time is being spent in my application. Profiling my optimized build resulted in gprof telling me that ~90% of the time is spent in main, where everything was inlined. I already knew that before profiling!

What I want to find out is how much time the inlined functions are taking, preferably without disabling optimization or inlining in my build options. The application is roughly an order of magnitude slower when profiled with inlining disabled. The difference in execution time is partly a matter of convenience, but I am also not confident that the performance profile of the program built with inlining disabled will correspond closely to that of the program built with inlining enabled.

In short: is there a way to get useful profiling information on a C++ program without disabling optimization or inlining?

Brad Larsen asked Jan 18 '10 17:01



2 Answers

I assume what you want to do is find out which lines of code are costing you enough to be worth optimizing. That is very different from timing functions. You can do better than gprof.

Here's a fairly complete explanation of how to do it.

You can do it by hand, or use one of the profilers that can provide the same information, such as oprofile or RotateRight's Zoom.
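To make the "by hand" option concrete, here is a rough sketch of the bookkeeping, assuming it means pausing the program at random moments in a debugger and writing down the call stack each time. The `StackSample` type, the `fraction_of_samples` function, and the "file:line" strings are hypothetical names chosen for illustration; they are not part of any profiler. The fraction of samples in which a line appears is a rough estimate of the share of run time that line is responsible for.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// One stack sample: the source locations seen on the call stack at the
// moment the program was paused (hypothetical "file:line" strings).
using StackSample = std::vector<std::string>;

// For each location, return the fraction of samples whose stack contains it.
// A location is counted at most once per sample, even if recursion puts it
// on the stack several times.
std::map<std::string, double> fraction_of_samples(
    const std::vector<StackSample>& samples) {
    std::map<std::string, double> fractions;
    if (samples.empty()) {
        return fractions;  // No samples, so nothing to estimate.
    }

    std::map<std::string, std::size_t> hits;
    for (const StackSample& sample : samples) {
        const std::set<std::string> unique(sample.begin(), sample.end());
        for (const std::string& location : unique) {
            ++hits[location];
        }
    }

    for (const auto& entry : hits) {
        fractions[entry.first] =
            static_cast<double>(entry.second) / samples.size();
    }
    return fractions;
}
```

With only a handful of samples the estimates are coarse, but a line that shows up in, say, half of the stacks is very likely worth looking at, whether or not the function containing it was inlined.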

BTW, inlining is of significant value only if the routines being inlined are small and don't call functions themselves, and if the lines where they are being called are active enough of the time to be significant.
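One way to act on this, sketched here as an assumption rather than something from the question: keep the tiny leaf helpers inlined, but mark a few larger routines as non-inlinable so a function-level profiler sees them as separate symbols. The `PROFILE_NOINLINE` macro and the function names `scale`, `accumulate_scaled`, and `core_routine` are hypothetical; `__attribute__((noinline))` is the GCC/Clang spelling, and other compilers use a different one.

```cpp
#include <cstddef>
#include <vector>

// GCC and Clang accept this attribute; on other compilers the macro
// expands to nothing and the function may still be inlined.
#if defined(__GNUC__)
#define PROFILE_NOINLINE __attribute__((noinline))
#else
#define PROFILE_NOINLINE
#endif

// Hypothetical small leaf helper: the kind of routine where inlining
// usually pays off, so it is left inlinable.
inline double scale(double x) { return 2.0 * x + 1.0; }

// Hypothetical larger routine: kept out of line so it appears as its own
// entry in a function-level profile. scale() can still be inlined into it.
PROFILE_NOINLINE double accumulate_scaled(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) {
        total += scale(v);
    }
    return total;
}

// Hypothetical core routine that would otherwise absorb everything.
double core_routine(const std::vector<double>& values, std::size_t passes) {
    double total = 0.0;
    for (std::size_t i = 0; i < passes; ++i) {
        total += accumulate_scaled(values);
    }
    return total;
}
```

If the run time barely changes after marking a routine this way, inlining it was not buying much, and the profile now shows its cost separately. If the run time changes a lot, the profile is no longer representative and the attribute should come off again.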

As for the order-of-magnitude performance ratio between the debug and release builds, it may be due to a number of things, and inlining may or may not be one of them. You can use the stackshot method mentioned above to find out for certain what is going on in either case. I've found that debug builds can be slow for other reasons, such as recursive data structure validation.

Mike Dunlavey answered Sep 27 '22 22:09


You can use a more powerful profiler, such as Intel's VTune, which can give you performance detail down to the level of individual assembly instructions.

http://software.intel.com/en-us/intel-vtune/

It's available for Windows and Linux, but it does cost money.

Inverse answered Sep 27 '22 21:09