The profilers I have experience with (mainly the Digital Mars D profiler that comes with the compiler) seem to slow the profiled program down massively. That has a major effect on my willingness to use a profiler, because it makes profiling a "real" run of many of my programs, as opposed to a run on a very small test input, impractical.

I don't know much about how profilers are implemented. Is a major (>2x) slowdown when profiling pretty much a fact of life, or are there profilers that avoid it? If it can be avoided, are there any fast profilers available for D, preferably for D2 and preferably for free?
Performance profilers are development tools that help you analyze where your application spends its time and resources, so that poorly performing sections of code can be improved.

Sampling profilers work by interrupting the application under test periodically, in proportion to its consumption of whatever resource you are interested in. While the program is interrupted, the profiler grabs a snapshot of its current state, including where in the code it is, and then lets the program continue. This gives a quick overall picture of the application's performance: the profiler polls the application at set intervals, determines which routine is currently executing, increments that routine's sample count, and reports the collected counts in its results.

Code profiling more generally examines how the application's code behaves while it runs so that it can be optimized; it looks at the memory, CPU, and network used by each software component or routine.
I don't know about D profilers, but in general there are two different ways a profiler can collect profiling information.
The first is instrumentation: injecting logging calls all over the place. This slows down the application more or less. Typically more.
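To make "injecting logging calls" concrete, here is a minimal C sketch built on GCC's -finstrument-functions hooks. This is purely an illustration of the technique, not how the Digital Mars D profiler is implemented: the compiler emits a call into the profiling code at every function entry and exit, so every single call in the program pays the overhead.

```c
/* trace.c - build with: gcc -finstrument-functions trace.c -o trace
 * GCC inserts calls to the two hooks below at every function entry and
 * exit, which is why instrumenting profilers add overhead to every call. */
#include <stdio.h>

/* The hooks must not be instrumented themselves, or they would recurse. */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *fn, void *call_site)
{
    /* A real profiler records a timestamp and the address in a buffer;
     * printing is only for illustration and is itself very slow. */
    fprintf(stderr, "enter %p (called from %p)\n", fn, call_site);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *fn, void *call_site)
{
    (void)call_site;
    fprintf(stderr, "exit  %p\n", fn);
}

static int work(int n) { return n * n; }   /* this call gets instrumented */

int main(void)
{
    int total = 0;
    for (int i = 0; i < 3; i++)
        total += work(i);
    printf("total = %d\n", total);
    return 0;
}
```

A real instrumenting profiler writes timestamps into a buffer instead of printing, but the per-call cost is still there, and that is where the large slowdowns come from.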
The second is sampling: the profiler interrupts the application at regular intervals and inspects the call stack. This does not slow down the application very much at all.
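Here is a rough sketch of the sampling approach, using a POSIX profiling timer and glibc's backtrace on Linux. Again this is only an illustration of the idea; real sampling profilers use safer and cheaper mechanisms and aggregate addresses instead of printing stacks.

```c
/* sample.c - build with: gcc -rdynamic sample.c -o sample
 * Sketch of a sampling profiler: a timer fires every 100 ms of CPU time,
 * and the signal handler grabs the current call stack. The program runs
 * at nearly full speed between samples. */
#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

static void on_sample(int sig)
{
    (void)sig;
    void *stack[64];
    /* Capture and dump the current call stack. A real profiler would
     * store the addresses and aggregate them after the run; note that
     * backtrace() in a signal handler is not strictly async-signal-safe. */
    int depth = backtrace(stack, 64);
    backtrace_symbols_fd(stack, depth, STDERR_FILENO);
    write(STDERR_FILENO, "----\n", 5);
}

static double burn(void)                  /* stand-in for real work */
{
    double x = 0;
    for (long i = 1; i < 200000000L; i++)
        x += 1.0 / i;
    return x;
}

int main(void)
{
    signal(SIGPROF, on_sample);

    /* Fire SIGPROF every 100 ms of CPU time consumed by this process. */
    struct itimerval tv = { {0, 100000}, {0, 100000} };
    setitimer(ITIMER_PROF, &tv, NULL);

    printf("%f\n", burn());
    return 0;
}
```

Between samples the program runs at essentially full speed, which is why the overhead of sampling is usually a few percent rather than a multiple.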
The downside of a sampling profiler is that the result is not as detailed as with an instrumenting profiler.
Check the documentation for your profiler to see whether you can run with sampling instead of instrumentation. Otherwise you have some new Google terms in "sampling" and "instrumenting".
My favorite method of profiling slows the program way way down, and that's OK. I run the program under the debugger, with a realistic load, and then I manually interrupt it. Then I copy the call stack somewhere, like to Notepad. So it takes on the order of a minute to collect one sample. Then I can either resume execution, or it's even OK to start it over from the beginning to get another sample.
I do this 10 or 20 times, long enough to see what the program is actually doing from a wall-clock perspective. When I see something that shows up a lot, then I take more samples until it shows up again. Then I stop and really study what it is in the process of doing and why, which may take 10 minutes or more. That's how I find out if that activity is something I can replace with more efficient code, i.e. it wasn't totally necessary.
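For example, with gdb the procedure looks something like this (myprog and its input are just placeholders; any debugger that can show a call stack works, and a D program would be built with debug info, e.g. dmd -g):

```
$ gdb --args ./myprog realistic-input
(gdb) run
^C                            # interrupt it once it has been running a while
(gdb) thread apply all bt     # this is the stack sample - copy it out
(gdb) continue                # or start over and take the next sample later
```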
You see, I'm not interested in measuring how fast or slow the program is going. I can do that separately, with maybe only a watch. I'm interested in finding out which activities take a large percentage of time (not amount, percentage), and if something takes a large percentage of time, that percentage is the probability that each stackshot will see it.
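To put a rough number on that (a quick worked example, not a measurement): if an activity is on the stack for a fraction p of the time, each random pause sees it with probability p, so the chance of never seeing it in n pauses is (1-p)^n.

```c
/* prob.c - build with: gcc prob.c -lm -o prob
 * If an activity occupies fraction p of the run time, each random pause
 * catches it with probability p, so it escapes n pauses with probability
 * (1-p)^n. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double p = 0.40;                  /* activity takes 40% of the time */
    for (int n = 5; n <= 20; n += 5)
        printf("%2d pauses: chance of missing it = %.5f\n", n, pow(1.0 - p, n));
    return 0;
}
```

With p = 0.40 the chance of missing the activity entirely is about 8% after 5 pauses and well under 1% after 10, which is why 10 or 20 samples are usually enough.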
By "activity" I don't necessarily mean where the PC hangs out. In realistic software the PC is almost always off in a system or library routine somewhere. What typically matters more is the call sites in our own code. If I see, for example, a chain of 3 calls showing up on half of the stack samples, that represents very good hunting, because if any one of those calls isn't truly necessary and can be done away with, execution time will drop by about half.
If you want a grinning manager, just do that once or twice.
Even in math-heavy scientific number-crunching apps, where you would think low-level optimization and hotspots would rule the day, you know what I often find? The math library routines are spending their time checking arguments, not crunching. Often the code is not doing what you think it's doing, and you don't have to run it at top speed to find that out.
I'd say yes, both sampling and instrumenting forms of profiling will tax your program heavily, regardless of whose profiler you are using and what language you are working in.