C and C++ source code profiling tools [duplicate]


Possible Duplicate:
What's your favorite profiling tool (for C++)

Are there any good tools for profiling source code that is a mix of C and C++? What are the pros and cons of each, and which ones have you used and would recommend? Please do not give me a list of tools from Google; I can do that too. What I want is to leverage the personal experience of someone who has used these tools and knows their pros and cons.
Thanks in advance.


asked Nov 10 '10 04:11

Alok Save

2 Answers

I've found gprof to be the best CPU hotspot profiler, and Google Performance Tools to be the best sampling profiler. Both work for C and C++.

In my opinion there are no good profiling tools on Windows.

GNU gprof pros and cons

GCC only
Works with C and C++
Only measures CPU time, and only for code inside the binary, so everything you wish to profile must be statically linked in
Very accurate
Adds a small overhead to execution
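
For readers who have not tried gprof, a minimal sketch of a workload to point it at might look like the following. The function names and file name are hypothetical and not from the original answer; the build steps in the comments assume a GCC toolchain and an unoptimized build, so that the loop is not folded away.

```cpp
// Hypothetical workload for trying out gprof (illustrative names only).
//
// Assumed build and run steps, with a main() that calls run_workload():
//   g++ -pg workload.cpp -o workload   # -pg adds gprof instrumentation
//   ./workload                         # writes gmon.out on normal exit
//   gprof ./workload gmon.out          # prints the flat profile and call graph
#include <cassert>
#include <cstdint>

// Deliberately slow: sums 1..n with a loop on every call.
std::uint64_t slow_triangle(std::uint64_t n) {
    std::uint64_t sum = 0;
    for (std::uint64_t i = 1; i <= n; ++i) sum += i;
    return sum;
}

// Closed-form equivalent; it should barely register in the profile.
std::uint64_t fast_triangle(std::uint64_t n) {
    return n * (n + 1) / 2;
}

// Calls both versions repeatedly so the flat profile has something to show.
std::uint64_t run_workload(std::uint64_t n, int repeats) {
    assert(repeats >= 0);
    std::uint64_t acc = 0;
    for (int r = 0; r < repeats; ++r) {
        acc += slow_triangle(n);
        acc += fast_triangle(n);
    }
    return acc;
}
```

Calling `run_workload` from `main` with a large `n` and a few thousand repeats should be enough for `slow_triangle` to dominate the flat profile, though the exact numbers will depend on the machine and compiler.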

Google Performance Tools pros and cons

I think it requires the GNU tool chain
Occasionally fails to identify symbols
Very customizable
Outputs to a huge variety of formats, including the Callgrind format, and automatically loads KCacheGrind for you
Has various memory profiling tools also
Is a sampling profiler, with minimal overhead

Related useful questions and answers

Alternative to -pg with Clang?
What's your favorite profiling tool (for C++)
Alternatives to gprof
C++ Code Profiler
Confusing gprof output

answered Oct 11 '22 06:10

Matt Joiner

I would respectfully disagree with Matt.

The tool I use all the time on Windows is the random-pausing technique, and it works with all languages that the IDE supports.

As an example of using it to do performance tuning, this case shows how a speedup of 43 times was achieved through a series of steps.

Gprof has a lot of problems, listed here, and according to the google-perftools manual, some of the same issues are repeated there, such as reporting procedures, not lines, emphasizing self (local) time, emphasizing the graph, etc. (I can't tell from the doc if it samples while blocked.)

As software systems become ever larger, self time becomes less and less relevant. The program counter spends most of its time in library routines or blocked in the system. Graphs become gigantic nests. People ask "I know function X is costly, but where in function X is the problem?" What's more, the "bottlenecks" get bigger and bigger, because the stack gets deeper on average, and every layer of the stack is a fresh opportunity to do more function calls than necessary.

Zoom is an example of a stack sampler that reports percent by line, samples while blocked, and allows user control of sampling so as not to dilute the sample set during user input.

EDIT: Sorry, can't leave well enough alone. Here's a new explanation:

The way programs work, they trace out a call tree, which is a lot like the oak tree outside my window. It has a trunk (main) which sprouts branches (call sites) which sprout further branches for several levels out to leaves (instructions) and acorns (blocking calls).

When the tree surgeon comes to prune (optimize) it, does he look only where the leaves are (hotspots)? Does he ignore acorns (no samples during blocking)? No, he looks for branches (call sites) that are both heavy (on the stack a lot) and unhealthy (unnecessary). Those are what he prunes. That is what random pausing and Zoom do: they help find those call sites.
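
The difference between "heavy on the stack" and "hot at the leaf" can be sketched in a few lines of C++. This is only an illustration, assuming the stack samples have already been collected somehow (for example by pausing in a debugger and copying each stack); the type names, function name, and call-site labels are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// One stack sample: call sites ordered from outermost (main) to innermost (leaf).
using StackSample = std::vector<std::string>;

struct SiteCost {
    double inclusive = 0.0;  // fraction of samples with the site anywhere on the stack
    double self = 0.0;       // fraction of samples with the site at the top of the stack
};

// Summarizes samples per call site. A site with high inclusive cost is a
// "heavy branch" even if its self cost is zero.
std::map<std::string, SiteCost> summarize(const std::vector<StackSample>& samples) {
    std::map<std::string, SiteCost> costs;
    if (samples.empty()) return costs;
    const double weight = 1.0 / static_cast<double>(samples.size());
    for (const StackSample& stack : samples) {
        if (stack.empty()) continue;
        // Count each site once per sample, even if recursion repeats it.
        std::set<std::string> seen(stack.begin(), stack.end());
        for (const std::string& site : seen) costs[site].inclusive += weight;
        costs[stack.back()].self += weight;
    }
    return costs;
}
```

With four samples of which three pass through a hypothetical call site `load:42`, that site gets an inclusive cost of 0.75 and a self cost of 0, so a report sorted by self time would never mention it, while a report sorted by inclusive cost puts it near the top.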

answered Oct 11 '22 04:10

Mike Dunlavey
