Profiling a program by type of activity

Tags:

c++

c

profiling

The output of a typical profiler is a list of the functions in your code, sorted by the amount of time each function took while the program ran.

This is very good, but sometimes I'm more interested in what the program was doing most of the time than in where EIP was most of the time.

An example output of my hypothetical profiler is:

Waiting for file IO - 19% of execution time.
Waiting for network -  4% of execution time.
Cache misses        - 70% of execution time.
Actual computation  -  7% of execution time.

Is there such a profiler? Is it possible to derive such an output from a "standard" profiler?

I'm using Linux, but I'll be glad to hear any solutions for other systems.

asked Feb 08 '11 19:02

Elazar Leibovich

2 Answers

DTrace originated on Solaris (it has since been ported to other systems such as macOS and FreeBSD), and it can monitor almost every kind of I/O, on-CPU and off-CPU time, time in specific functions, sleep time, and so on. I'm not sure whether it can determine cache misses, though, assuming you mean CPU cache; I'm not sure whether the CPU makes that information available.

answered Oct 05 '22 13:10

Mark B

Please take a look at this and this.

Consider any thread. At any instant of time it is doing something, and it is doing it for a reason, and slowness can be defined as the time it spends for poor reasons - it doesn't need to be spending that time.

Take a snapshot of the thread at a point in time. Maybe it's in a cache miss, in an instruction, in a statement, in a function, called from a call instruction in another function, called from another, and so on, up to call _main. Every one of those steps has a reason, which an examination of the code reveals.

If any one of those steps is not a very good reason and could be avoided, that instant of time does not need to be spent.

Or maybe at that time the disk is coming around to a certain sector, so some data streaming can be started, so a buffer can be filled, so a read statement can be satisfied, in a function, and that function is called from a call site in another function, and that from another, and so on, up to call _main, or whatever happens to be the top of the thread.

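The same point applies here: if any one of those steps is not a very good reason and could be avoided, that instant of time does not need to be spent.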

So, the way to find bottlenecks is to find when the code is spending time for poor reasons, and the best way to find that is to take snapshots of its state. The EIP, or any other tiny piece of the state, is not going to do it, because it won't tell you why.
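As a rough sketch of how snapshots could be turned into the kind of breakdown the question asks for, assume the wall-clock stack snapshots have already been collected (for example by interrupting the program in a debugger) and stored as lists of function names, outermost caller first. The `Snapshot` type, the `classify` rules, and the function names they match are all hypothetical; a real classification has to come from reading your own stacks. Cache misses will not show up from function names alone, so this sketch lumps them into "on CPU".

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One wall-clock snapshot of a thread: frames from outermost to innermost.
// (Hypothetical representation, chosen only for this illustration.)
using Snapshot = std::vector<std::string>;

// Hypothetical rule: name the activity after the innermost frame.
std::string classify(const Snapshot& s) {
    if (s.empty()) return "unknown";
    const std::string& leaf = s.back();
    if (leaf == "read" || leaf == "write" || leaf == "fsync") return "file IO";
    if (leaf == "recv" || leaf == "send" || leaf == "connect") return "network";
    return "on CPU";
}

// Percent of snapshots that fall into each activity.
std::map<std::string, double> activity_percent(const std::vector<Snapshot>& snaps) {
    assert(!snaps.empty());
    std::map<std::string, std::size_t> counts;
    for (const Snapshot& s : snaps) ++counts[classify(s)];

    std::map<std::string, double> pct;
    for (const auto& kv : counts) {
        pct[kv.first] = 100.0 * static_cast<double>(kv.second) /
                        static_cast<double>(snaps.size());
    }
    return pct;
}
```

The category label is only a summary. The rest of each snapshot, the chain of callers above the innermost frame, is what tells you why the time is being spent.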

Very few profilers "get it". The ones that do are the wall-clock-time stack samplers that report, by line of code (not by function), the percent of time each line is active (not the amount of time, and especially not "self" or "exclusive" time). One that does is Zoom, and there are others.
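To make "percent of time active, by line of code" concrete, here is a minimal sketch, not how Zoom or any particular profiler is implemented. It assumes each stack sample is already available as a list of call-site strings such as "file:line"; the `StackSample` type and the function name are made up for the illustration. A line counts as active in a sample if it appears anywhere on the stack, and it is counted once per sample even when recursion puts it there twice.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// One stack sample: the call sites ("file:line") on the stack at one instant.
// (Hypothetical representation, chosen only for this illustration.)
using StackSample = std::vector<std::string>;

// For each line, the percent of samples in which it appears on the stack.
std::map<std::string, double> percent_active_by_line(
        const std::vector<StackSample>& samples) {
    assert(!samples.empty());
    std::map<std::string, std::size_t> hits;
    for (const StackSample& sample : samples) {
        // De-duplicate so a recursive call site is counted once per sample.
        std::set<std::string> unique_lines(sample.begin(), sample.end());
        for (const std::string& line : unique_lines) ++hits[line];
    }

    std::map<std::string, double> pct;
    for (const auto& kv : hits) {
        pct[kv.first] = 100.0 * static_cast<double>(kv.second) /
                        static_cast<double>(samples.size());
    }
    return pct;
}
```

Because every line on the stack is credited, the percentages do not add up to 100. Each one estimates how much wall-clock time would be saved if that line could be removed.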

Looking at where the EIP hangs out is like trying to tell time on a clock with only a second hand. Measuring functions is like trying to tell time on a clock with some of the digits missing. Profiling only during CPU time, not during blocked time, is like trying to tell time on a clock that randomly stops running for long stretches. Being concerned about measurement precision is like trying to time your lunch hour to the second.
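One way to see why precision matters so little is a small calculation, offered here as an illustration under stated assumptions rather than as part of the original argument. Assume an activity takes a fraction `f` of wall-clock time and that `n` snapshots are taken at independent random instants. The number of snapshots landing in that activity is then binomially distributed, and the hypothetical helper below gives the chance of seeing it in at least two of them.

```cpp
#include <cassert>
#include <cmath>

// Probability that an activity taking fraction f of wall-clock time shows up
// in at least two of n independent random snapshots (binomial model).
double prob_seen_at_least_twice(double f, int n) {
    assert(f >= 0.0 && f <= 1.0 && n >= 1);
    const double none = std::pow(1.0 - f, n);              // seen in 0 snapshots
    const double once = n * f * std::pow(1.0 - f, n - 1);  // seen in exactly 1
    return 1.0 - none - once;
}
```

With the 70% figure from the question and only 10 snapshots, this probability is above 0.999, so a large problem is very unlikely to hide even when the percentage itself is only roughly known.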

This is not a mysterious subject.

answered Oct 05 '22 13:10

Mike Dunlavey
