Often during my work I write code to read lines from a file and I process those lines one at a time.
Sometimes the line processing is complicated and the file is long. For example, today it takes roughly a minute to process 200 lines, and the file has 175k lines in total.
I want to figure out which part of my code is taking a long time, so I decided to use cProfile in Python.
The problem is that I can't actually run the whole program because that would take too long. If I interrupt the process midway with an exit signal, cProfile also dies without producing a report. Modifying the code to stop after reading only the top K lines is annoying, because I tend to do this kind of thing a lot for different types of data in my job. I want to avoid adding options only for the sake of profiling if possible.
What would be the cleanest way to tell cProfile to run for 3 minutes, profile what happens, stop, and then report its findings?
The syntax is cProfile.run(statement, filename=None, sort=-1). The statement argument is a string of Python code to profile, for example a call to the function you want to measure. To save the output to a file instead of printing it, pass the file name as the filename argument.
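As a minimal sketch of that call: the statement below is a throwaway expression chosen only for illustration, and the output path is a hypothetical temporary file.

```python
import cProfile
import os
import pstats
import tempfile

# Hypothetical output location; any writable path works.
out_path = os.path.join(tempfile.mkdtemp(), "prof")

# The statement is a string of Python code that is executed under the profiler.
cProfile.run("sum(i * i for i in range(100000))", filename=out_path)

# With filename given, nothing is printed; the raw stats are saved instead.
stats = pstats.Stats(out_path)
print("function calls recorded:", stats.total_calls)
```

The saved file is the same kind of file that python -m cProfile -o produces, so it can be inspected with pstats in the same way.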
Introduction to the profilers
cProfile and profile provide deterministic profiling of Python programs. A profile is a set of statistics that describes how often and for how long various parts of the program executed. These statistics can be formatted into reports via the pstats module.
Profiling is a technique to figure out how time is spent in a program. With these statistics, we can find the “hot spot” of a program and think about ways of improvement. Sometimes, a hot spot in an unexpected location may hint at a bug in the program as well.
Step 1: run your script myscript.py under the profiler for 3 minutes, outputting the profiling information to the file prof. On Linux and similar, you can do this with
timeout -s INT 3m python -m cProfile -o prof myscript.py
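(Note: if you omit -s INT, SIGTERM is used instead of SIGINT, which seems to work on Python 2 but not on Python 3. SIGINT reaches Python as a KeyboardInterrupt, which gives the profiler a chance to write its output before the interpreter exits.) Alternatively, on any system, you should be able to do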
python -m cProfile -o prof myscript.py
then press Ctrl-C at the end of 3 minutes.
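If you would rather keep the time limit inside Python, the same idea can be sketched with cProfile.Profile and an alarm signal. This is only a sketch under assumptions: it is Unix-only (it relies on SIGALRM), it must run in the main thread, and the names profile_for, ProfilingTimeUp and spin_forever are made up for this example.

```python
import cProfile
import os
import pstats
import signal
import tempfile


class ProfilingTimeUp(Exception):
    """Raised by the alarm handler when the time budget is spent."""


def _raise_time_up(signum, frame):
    raise ProfilingTimeUp()


def profile_for(func, seconds, out_path):
    """Run func() under cProfile for at most `seconds`, then dump the stats."""
    profiler = cProfile.Profile()
    previous_handler = signal.signal(signal.SIGALRM, _raise_time_up)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # one-shot timer
    try:
        profiler.enable()
        try:
            func()
        except ProfilingTimeUp:
            pass  # time budget reached; stop quietly
        finally:
            profiler.disable()
    finally:
        # Cancel the timer and restore whatever handler was there before.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous_handler)
    profiler.dump_stats(out_path)


def spin_forever():
    # Hypothetical stand-in for a long-running line-processing loop.
    while True:
        sum(range(1000))


out_path = os.path.join(tempfile.mkdtemp(), "prof")
profile_for(spin_forever, 0.5, out_path)
pstats.Stats(out_path).sort_stats("time").print_stats(5)
```

To profile an existing script without editing it, func could be something like lambda: runpy.run_path("myscript.py", run_name="__main__"), with seconds set to 180 for the 3 minutes in the question.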
Step 2: get some statistics from the prof file with something like
python -c "import pstats; pstats.Stats('prof').sort_stats('time').print_stats(20)"
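The same report can be produced from a script, which makes it easier to try other sort orders. The sketch below first generates a small profile so that it is self-contained; parse_line and process_lines are hypothetical stand-ins for your own code. It sorts by "cumulative" (time in a function including everything it calls) rather than "time" (time in the function body only).

```python
import cProfile
import io
import os
import pstats
import tempfile


def parse_line(line):
    # Hypothetical per-line work.
    return [int(field) for field in line.split(",")]


def process_lines(lines):
    return [parse_line(line) for line in lines]


# Create a small profile file to inspect (stands in for the real "prof").
prof_path = os.path.join(tempfile.mkdtemp(), "prof")
profiler = cProfile.Profile()
profiler.enable()
process_lines(["1,2,3"] * 2000)
profiler.disable()
profiler.dump_stats(prof_path)

# Load the file and write the top 10 entries by cumulative time to a buffer.
report = io.StringIO()
stats = pstats.Stats(prof_path, stream=report)
stats.strip_dirs().sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

Sorting by "cumulative" tends to point at the high-level function where the time goes, while "time" points at the individual hot function.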