
What are the best practices and tools for profiling and performance testing Python code? [duplicate]

Possible Duplicate:
How to profile my code?

What are the best practices and tools for profiling and performance testing Python code? Are there any quick wins or recommendations?

cProfile seems popular, and there are some great notes below; both answers are very good tutorials. Vote away and I'll pick the top one in a day or two. Thanks @senderle and @campos.ddc.

Once a problem area is found, are there any idioms or tips for rewriting the code to make it faster?

Matt Alcock asked Feb 23 '12 00:02



1 Answer

cProfile is the classic profiling tool. The basic way to use it is like so:

python -m cProfile myscript.py

Here I've called it on the test routine of a reference implementation of the Mersenne Twister that I wrote.

me@mine $ python -m cProfile mersenne.twister.py 
True
True
1000000
         1003236 function calls in 2.163 CPU seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.163    2.163 <string>:1(<module>)
        1    0.001    0.001    2.162    2.162 mersenne.twister.py:1(<module>)
        3    0.001    0.000    0.001    0.000 mersenne.twister.py:10(init_gen)
  1000014    1.039    0.000    1.821    0.000 mersenne.twister.py:19(extract_number)
        1    0.000    0.000    0.000    0.000 mersenne.twister.py:3(Twister)
     1603    0.766    0.000    0.782    0.000 mersenne.twister.py:33(generate_numbers)
        1    0.000    0.000    0.000    0.000 mersenne.twister.py:4(__init__)
        1    0.317    0.317    2.161    2.161 mersenne.twister.py:42(_test)
        1    0.001    0.001    2.163    2.163 {execfile}
        1    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
     1608    0.038    0.000    0.038    0.000 {range}

ncalls is the number of times a function was called. tottime is the total time spent in a function, excluding time spent in calls to sub-functions. The first percall is tottime / ncalls. cumtime is the time spent in the function including time spent in sub-function calls, and the second percall is cumtime divided by the number of primitive (non-recursive) calls. The last column identifies the function as filename:lineno(function).
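Because the command-line report above is ordered by standard name, the expensive functions are scattered through the table. Here is a minimal sketch of producing the same report from inside a script, sorted by tottime. The `work` function is a hypothetical stand-in workload, not the Mersenne Twister code, and `profile_report` is just an illustrative helper name.

```python
import cProfile
import io
import pstats


def work(n):
    """Hypothetical stand-in workload: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_report(func, *args, sort_key="tottime", limit=5):
    """Run func under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)

    # Write the table to a string instead of stdout so it can be inspected.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats(sort_key).print_stats(limit)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_report(work, 100_000)
    print(report)
```

From the command line, `python -m cProfile -s tottime myscript.py` should give a similarly sorted table without changing the script.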

In most cases, look at ncalls and tottime first. In the data above, most of the program's time is spent in extract_number: 1.039 seconds of its own time and 1.821 seconds cumulative, out of 2.163 seconds in total. It is also called a great many times (1000014). So anything I can do to speed up extract_number will significantly speed up this test code. If a change saves one microsecond per call, that saving is multiplied by 1000014, which comes to about a full second.

Then I should work on generate_numbers. Gains there won't matter as much, but since that function burns another 0.77 seconds, there is still some benefit to be had.
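To check whether a rewrite really buys that microsecond, one option is to time the hot function in isolation with timeit and scale the difference by the call count. The sketch below assumes two hypothetical variants of a small bit-twiddling function; they are not the actual extract_number, and the helper names are made up for illustration. No particular result is implied, since the timings depend on the machine and interpreter.

```python
import timeit

CALLS = 1_000_014  # ncalls reported for extract_number in the profile


def temper_v1(y):
    """Hypothetical hot function, written with plain assignments."""
    y = y ^ (y >> 11)
    y = y ^ ((y << 7) & 0x9D2C5680)
    y = y ^ ((y << 15) & 0xEFC60000)
    y = y ^ (y >> 18)
    return y


def temper_v2(y):
    """Same computation, written with augmented assignments."""
    y ^= y >> 11
    y ^= (y << 7) & 0x9D2C5680
    y ^= (y << 15) & 0xEFC60000
    y ^= y >> 18
    return y


def seconds_per_call(func, arg, number=200_000, repeat=5):
    """Best-of-repeat average time for one call of func(arg)."""
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    t1 = seconds_per_call(temper_v1, 123456789)
    t2 = seconds_per_call(temper_v2, 123456789)
    print(f"v1: {t1 * 1e6:.3f} us/call, v2: {t2 * 1e6:.3f} us/call")
    # Projected effect on the whole run, given the profiled call count.
    print(f"estimated change over {CALLS} calls: {(t1 - t2) * CALLS:.3f} s")
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes; the lambda adds the same overhead to both variants, so the difference stays comparable.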

That should give you the general idea. Note, however, that the tottime number can sometimes be deceptive, in cases of recursion, for example.
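Recursion is easy to spot in a report, because ncalls is then printed as two numbers, total calls and primitive (non-recursive) calls, separated by a slash. A small sketch, using a naive Fibonacci function as a hypothetical example; it reads the `stats` dictionary of `pstats.Stats`, and the tuple layout used here (primitive calls, total calls, tottime, cumtime, callers) is an assumption about that internal structure.

```python
import cProfile
import pstats


def fib(n):
    """Naive recursive Fibonacci, used only to generate recursive calls."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def call_counts(func, *args):
    """Profile func and return (primitive_calls, total_calls) for it."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)

    # Keys are (filename, lineno, function name); values are assumed to be
    # (primitive calls, total calls, tottime, cumtime, callers).
    for (_filename, _lineno, name), entry in stats.stats.items():
        if name == func.__name__:
            primitive, total = entry[0], entry[1]
            return primitive, total
    raise LookupError(f"{func.__name__} not found in profile")


if __name__ == "__main__":
    primitive, total = call_counts(fib, 10)
    # In a printed report this would appear in the ncalls column as total/primitive.
    print(f"ncalls would read {total}/{primitive}")
```

When the two numbers differ, per-call figures for that function deserve a second look before you draw conclusions from them.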

senderle answered Sep 20 '22 11:09