Motivation: I cant get google cpu profiler to work on machine where code runs(with my last breath I curse libunwind :)), so I was wondering if the gdb supports high frequency random pausing of the program execution, storing the name of the function where break occured and counting how many times it paused in function x. That is what I would call "run time sampling", there is probably more precise/smarter name. I looked at the oprofile, but it is to complicated to a) figure out if it can do it b) to figure out how to do it EDIT: apparently correct name is: "statistical sampling method"
EDIT2: reason why Im offering a bounty for this is that I see some ppl on SO recommending doing manual break 10-20x and examining stack with bt...
Seems very wasteful when it comes to time, so I guestimate some smart ppl automated it. :)
EDIT3: gprof wont cut it... i tried running it recently on ARM system and output was trash... :( I guess its troubles with multithreading is the reason for that...
You can manually sample in GDB by pausing it at run time.
What you seem to think you want is gprof, but if your goal is to make the program as fast as possible, then I would suggest
High frequency of sampling is not helpful.
Counting the number of samples where the program counter is in function X is not helpful except in artificially small programs.
If you follow that link, you will see the reasons why, and directions for how to do it successfully.
GDB would not do this well, although you could certainly hack something up that gave wildly inaccurate results.
I'd suggest Valgrind's "Callgrind" plugin. As a bonus it requires absolutely no recompilation or other special setup. All you need is valgrind installed in your system, and debug information in your program (or, full symbol information, at least; I'm not sure).
You then invoke your program like this:
valgrind --tool=callgrind <your program command line>
When it's done there will be a file name callgrind.out.<pid>
in the current directory. You can read and visualise this file with a very nice GUI tool called kcachegrind
(usually you have to install this separately).
The only problem is that, because callgrind slows the execution of your program slightly, the time spent in system calls can appear smaller (in percentage terms) than it really would be. By default, callgrind does not include system time in its counters, so the values it give you are a real comparison of the code in your program, if not the actual time 'under' that function. This can be confusing, at first, so if that happens you try adding --collect-systime=yes
.
I'm not sure what the state of callgrind on ARM might be. ARMv7 is listed as a supported platform, but only says "fairly complete", whatever that means.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With