Using the python built-in profiler with a script runninng in one processor (and no multithreading)
time python -m cProfile myscript.py
the CPU time reported by the profiler is 345.710 CPU seconds
24184348 function calls (24183732 primitive calls) in 345.710 CPU seconds
and the real, user and sys time are:
real 5m45.926s
user 1m59.340s
sys 0m39.452s
As you can see the CPU time is nearly the real time (345.710 = 5m45.710s).
So, given that result, is it possible to assume that CPU time reported by the profiler includes the time slices used by other processes and time the process spends blocked? i.e. that the profiler CPU time is not the process time (user+sys) but the wall clock time as it is explained in What do 'real', 'user' and 'sys' mean in the output of time(1)?
Thank very much in advance
Wall clock time vs. CPU time. Wall clock time measures how much time has passed, as if you were looking at the clock on your wall. CPU time is how many seconds the CPU was busy.
Analysis of the profiler data is done using the Stats class. This class constructor creates an instance of a “statistics object” from a filename (or list of filenames) or from a Profile instance.
Profiling is a technique to figure out how time is spent in a program. With these statistics, we can find the “hot spot” of a program and think about ways of improvement. Sometimes, a hot spot in an unexpected location may hint at a bug in the program as well.
Python includes a built-in module called cProfile which is used to measure the execution time of a program. The cProfiler module provides all information about how long the program is executing and how many times the function gets called in a program.
This answers beautifully details on the meaning of real, user and sys timings. To quote:
'Real' is wall clock time - time from start to finish of the call. This is all elapsed time including time slices used by other processes and time the process spends blocked (for example if it is waiting for I/O to complete).
'User' is the amount of CPU time spent in user-mode code (outside the kernel) within the process. This is only actual CPU time used in executing the process. Other processes and time the process spends blocked do not count towards this figure.
'Sys' is the amount of CPU time spent in the kernel within the process. This means executing CPU time spent in system calls within the kernel, as opposed to library code, which is still running in user-space. Like 'user', this is only CPU time used by the process.
From the above explanation it appears like User+Sys time should be equal to CPU seconds. Instead, its closer to 'real' time. Strange!
There is a fair explanation for that. 'User' time does not include CPU seconds spent on I/O operations within the process. It just measures CPU time spent on user-mode code within the memory. The rule of thumb is:
real time = user + sys + I/O time + interpreter startup time + bytecode compilation time
To validate this I made a urllib2.urlopen(urllib2.Request(url))
call doing intensive I/O. Here are the results:
100792 function calls (98867 primitive calls) in 2.076 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
512 0.719 0.001 0.719 0.001 {method 'recv' of '_socket.socket' objects}
1 0.464 0.464 0.473 0.473 {_socket.getaddrinfo}
1 0.352 0.352 0.352 0.352 <string>:1(connect)
1 0.248 0.248 0.348 0.348 common_functions.py:1(<module>)
1 0.022 0.022 0.075 0.075 __init__.py:2(<module>)
1 0.017 0.017 0.030 0.030 urllib.py:1442(getproxies_macosx_sysconf)
330/67 0.012 0.000 0.031 0.000 sre_parse.py:385(_parse)
real 0m2.255s
user 0m0.352s
sys 0m0.093s
Here, 2.076-(0.352+0.093), i.e, 1.631 CPU secs were consumed in I/O operations (mainly _socket.socket
and _socket.getaddrinfo
). Rest of the time, 2.255-2.076, were spent in cold start of the code.
Hope that was useful.
Update: In multiple core systems, where multiple CPUs work in parallel, the case is slightly different. Total CPU seconds reported by cProfile is the sum total of time spent by all CPUs individually. For eg: In a 2 core system, if one CPU works for 10 secs. In parallel, another CPU works for 15 secs. Total CPU seconds reported will be 25 secs. Although the real time elapsed may be just 15 secs. Hence CPU time may be more than Real Time in multi core systems. As the CPUs are working in parallel
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With