Like the title says, I'd like to know which is the difference between the two commands and the results they produce.
From reading this I understand gprof requires an extra flag (-g, the same needed for gdb?), but that doesn't give me an answer about the difference between the profilers.
The GNU profiler gprof is a useful tool for measuring the performance of a program--it records the number of calls to each function and the amount of time spent there, on a per-function basis. Functions which consume a large fraction of the run-time can be identified easily from the output of gprof .
Description. The gprof command produces an execution profile of C, Pascal, FORTRAN, or COBOL programs. The effect of called routines is incorporated into the profile of each caller. The gprof command is useful in identifying how a program consumes CPU resource.
gprof
was invented specifically because prof
only gives you "self time".
"self time" tells you what fraction of the total time the program counter is found in each routine.
That's fine if the only kinds of "bottlenecks" you need to consider are solved by shaving cycles at the bottom of the call stack.
What about the kind of "bottleneck" that you solve by reducing the number of calls to subroutines?
That's what gprof
was supposed to help you find, by "charging back" self time to callers.
"inclusive time" in a routine consists of its "self time", plus what bubbles up from the routines it calls.
Here's a simple example of what gprof
does:
A calls B ten times, and B calls C ten times, and C does some CPU-intensive loop for 10 sample times.
Notice that none of the samples land in A or B because the program counter spends practically all of its time in C.
gprof
, by counting the calls, and keeping track of who made them, does some math and propagates time upward.
So you see, a way to speed up this program is to have A call B fewer times, or have B call C fewer times, or both.
prof
could not give you that kind of information - you'd have to guess.
If you're thinking of using gprof
, be aware of its problems, like not giving line-level information, being blind to I/O, getting confused by recursion, and misleading you into thinking certain things help you find problems, like sample rate, call graph, invocation counts, etc.
Keep in mind that its authors only claimed it was a measurement tool, not a problem-finding tool, even though most people think it's the latter.
What I understand is that both of the commands output a similar flat file who looks like :
Flat profile:
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls us/call us/call name
37.50 0.15 0.15 48000 3.12 3.12 Life::neighbor_count(int, int)
17.50 0.22 0.07 _IO_do_write
10.00 0.26 0.04 __overflow
7.50 0.29 0.03 _IO_file_overflow
7.50 0.32 0.03 _IO_putc
5.00 0.34 0.02 12 1666.67 14166.67 Life::update(void)
5.00 0.36 0.02 stdiobuf::overflow(int)
5.00 0.38 0.02 stdiobuf::sys_write(char const *, int)
2.50 0.39 0.01 ostream::operator<<(char)
2.50 0.40 0.01 internal_mcount
0.00 0.40 0.00 12 0.00 0.00 Life::print(void)
0.00 0.40 0.00 12 0.00 0.00 to_continue(void)
0.00 0.40 0.00 1 0.00 0.00 Life::initialize(void)
0.00 0.40 0.00 1 0.00 0.00 instructions(void)
0.00 0.40 0.00 1 0.00 170000.00 main
The flat profile shows the total amount of time your program spent executing each function.
gprof seems to output one more file representing the call-graph, it looks like :
Call graph :
granularity: each sample hit covers 4 byte(s) for 2.50% of 0.40 seconds
index % time self children called name
0.02 0.15 12/12 main [2]
[1] 42.5 0.02 0.15 12 Life::update(void) [1]
0.15 0.00 48000/48000 Life::neighbor_count(int, int) [4]
-----------------------------------------------
0.00 0.17 1/1 _start [3]
[2] 42.5 0.00 0.17 1 main [2]
0.02 0.15 12/12 Life::update(void) [1]
0.00 0.00 12/12 Life::print(void) [13]
0.00 0.00 12/12 to_continue(void) [14]
0.00 0.00 1/1 instructions(void) [16]
0.00 0.00 1/1 Life::initialize(void) [15]
-----------------------------------------------
[3] 42.5 0.00 0.17 _start [3]
0.00 0.17 1/1 main [2]
-----------------------------------------------
0.15 0.00 48000/48000 Life::update(void) [1]
[4] 37.5 0.15 0.00 48000 Life::neighbor_count(int, int) [4]
-----------------------------------------------
The call graph shows how much time was spent in each function and its children. From this information, you can find functions that, while they themselves may not have used much time, called other functions that did use unusual amounts of time.
Finally : It seems that prof only shows you the time spent in a function while gprof shows the call graph to show what is taking time also inside the functions. With that you can find the routines who are taking times inside the functions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With