One could use a profiler, but why not just halt the program? [closed]

If something is making a single-threaded program take, say, 10 times as long as it should, you could run a profiler on it. You could also just halt it with a "pause" button, and you'll see exactly what it's doing.

Even if it's only 10% slower than it should be, if you halt it more times, before long you'll see it repeatedly doing the unnecessary thing. Usually the problem is a function call somewhere in the middle of the stack that isn't really needed. This doesn't measure the problem, but it sure does find it.

Edit: The objections mostly assume that you only take 1 sample. If you're serious, take 10. Any line of code responsible for some percentage of the waste, like 40%, will appear on the stack on that fraction of samples, on average. Bottlenecks (in single-threaded code) can't hide from it.
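A minimal Python sketch of the "take 10 samples and see what keeps showing up" idea follows. It is an illustration only: the question is about pausing in a debugger, whereas this polls another thread's stack from inside the same process. The helper names `sample_stack` and `tally_samples` are hypothetical, and `sys._current_frames()` is a CPython facility intended for specialized use.

```python
import collections
import sys
import threading
import time
import traceback


def sample_stack(thread_id):
    """Return the set of (file, line, function) entries on one thread's stack right now."""
    frame = sys._current_frames().get(thread_id)
    if frame is None:  # thread has finished or the id is unknown
        return set()
    # A set, so a line that appears twice (recursion) counts once per sample.
    return {(f.filename, f.lineno, f.name) for f in traceback.extract_stack(frame)}


def tally_samples(thread_id=None, n_samples=10, interval=0.05):
    """Take n_samples stack samples and count how many samples each line appears on."""
    if thread_id is None:
        thread_id = threading.main_thread().ident  # assumption: sample the main thread
    counts = collections.Counter()
    for _ in range(n_samples):
        counts.update(sample_stack(thread_id))
        time.sleep(interval)
    return counts
```

Run `tally_samples` from a helper thread while the slow work runs elsewhere, then look at `counts.most_common()`. A line that shows up on 4 of 10 samples is a candidate for roughly 40% of the time, give or take sampling noise.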

EDIT: To show what I mean, many objections are of the form "there aren't enough samples, so what you see could be entirely spurious" - vague ideas about chance. But if something of any recognizable description, not just being in a routine or the routine being active, is in effect for 30% of the time, then the probability of seeing it on any given sample is 30%.

Then suppose only 10 samples are taken. The number of times the problem will be seen in 10 samples follows a binomial distribution, and the probability of seeing it 0 times is .028. The probability of seeing it 1 time is .121. For 2 times, the probability is .233, and for 3 times it is .267, after which it falls off. Since the probability of seeing it fewer than two times is .028 + .121 = .149, the probability of seeing it two or more times is 1 - .149 = .851. The general rule is if you see something you could fix on two or more samples, it is worth fixing.

In this case, the chance of seeing it two or more times in 10 samples is 85%. If you're in the 15% who don't see it, just take more samples until you do. (If the number of samples is increased to 20, the chance of seeing it two or more times increases to more than 99%.) So it hasn't been precisely measured, but it has been precisely found. It's important to understand that it could easily be something that a profiler could not actually find, such as something involving the state of the data, not the program counter.
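The binomial figures for a problem that is active 30% of the time can be recomputed directly. This is a small sketch using only the standard library; the function names are made up for illustration.

```python
from math import comb


def prob_seen_exactly(k, n, p):
    """Binomial probability of seeing the problem on exactly k of n samples."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_seen_at_least(k, n, p):
    """Probability of seeing the problem on k or more of n samples."""
    return 1.0 - sum(prob_seen_exactly(i, n, p) for i in range(k))


p = 0.30  # fraction of time the problem is in effect
for k in range(4):
    print(f"P(seen exactly {k} times in 10) = {prob_seen_exactly(k, 10, p):.3f}")
for n in (10, 20):
    print(f"P(seen 2+ times in {n}) = {prob_seen_at_least(2, n, p):.3f}")
# The last two lines print 0.851 for 10 samples and 0.992 for 20 samples.
```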

287

asked Nov 05 '08 19:11

Mike Dunlavey

2 Answers

On Java servers it's always been a neat trick to do 2-3 quick Ctrl-Breaks in a row and get 2-3 thread dumps of all running threads. Simply looking at where all the threads "are" can pinpoint your performance problems very quickly.
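The answer is about Java thread dumps. As a rough analogue in Python (an assumption for illustration, not something the answer uses), the standard-library `faulthandler` module can write the stack of every thread on demand. The wrapper names `dump_all_threads` and `install_dump_signal` are hypothetical.

```python
import faulthandler
import signal
import sys


def dump_all_threads(out=sys.stderr):
    """Write the current stack of every thread to `out`.

    `out` must be backed by a real file descriptor (a regular file or stderr).
    """
    faulthandler.dump_traceback(file=out, all_threads=True)


def install_dump_signal():
    """On Unix, dump all thread stacks whenever the process receives SIGUSR1.

    Returns False where this is unavailable (for example on Windows).
    """
    if hasattr(faulthandler, "register") and hasattr(signal, "SIGUSR1"):
        faulthandler.register(signal.SIGUSR1, all_threads=True)
        return True
    return False
```

With the signal handler installed, sending the signal two or three times in a row gives two or three dumps to compare, much like the repeated Ctrl-Break described above.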

This technique can reveal more performance problems in 2 minutes than any other technique I know of.

answered Oct 07 '22 06:10

krosenvold

Because sometimes it works, and sometimes it gives you completely wrong answers. A profiler has a far better record of finding the right answer, and it usually gets there faster.

answered Oct 07 '22 06:10

Paul Tomblin

Tags: performance, optimization, profiling
