Consider the following loop:
for i in range(20):
if i == 10:
subprocess.Popen(["echo"]) # command 1
t_start = time.time()
1+1 # command 2
t_stop = time.time()
print(t_stop - t_start)
“command 2” command takes systematically longer to run when “command 1” is run before it. The following plot shows the execution time of 1+1
as a function of the loop index i
, averaged over 100 runs.
Execution of 1+1
is 30 times slower when preceded by subprocess.Popen
.
It gets even weirder. One may think that only the first command run after subprocess.Popen()
is affected, but it is not the case. The following loop shows that all commands in the current loop iteration are affected. But the subsequent loops iterations seem to be mostly OK.
var = 0
for i in range(20):
if i == 10:
# command 1
subprocess.Popen(['echo'])
# command 2a
t_start = time.time()
1 + 1
t_stop = time.time()
print(t_stop - t_start)
# command 2b
t_start = time.time()
print(1)
t_stop = time.time()
print(t_stop - t_start)
# command 2c
t_start = time.time()
var += 1
t_stop = time.time()
print(t_stop - t_start)
Here’s a plot of the execution times for this loop, average over 100 runs:
More remarks:
subprocess.Popen()
(“command 1”) with time.sleep()
, or rawkit’s libraw C++ bindings initialization (libraw.bindings.LibRaw()
). However, using other libraries with C++ bindings such as libraw.py, or OpenCV’s cv2.warpAffine()
do not affect execution times. Opening files don’t either.time.time()
, because it is visible with timeit.timeit()
, and even by measuring manually when print()
result appear.subprocess.Popen
) and “command 2”.arr += 1
operation can take up to 300 ms!Question: What may cause this effect, and why does it affect only the current loop iteration?
I suspect that it could be related to context switching, but this doesn’t seem to explain why a whole loop iteration would affected. If context switching is indeed the cause, why do some commands trigger it while others don’t?
The main difference is that subprocess. run() executes a command and waits for it to finish, while with subprocess. Popen you can continue doing your stuff while the process finishes and then just repeatedly call Popen. communicate() yourself to pass and receive data to your process.
Using Popen MethodThe Popen method does not wait to complete the process to return a value. This means that the information you get back will not be complete.
Popen is more general than subprocess. call . Popen doesn't block, allowing you to interact with the process while it's running, or continue with other things in your Python program. The call to Popen returns a Popen object.
Most of your interaction with the Python subprocess module will be via the run() function. This blocking function will start a process and wait until the new process exits before moving on.
my guess would be that this is due to the Python code being evicted from various caches in the CPU/memory system
the perflib
package can be used to extract more detailed CPU level stats about the state of the cache — i.e. the number of hits/misses.
I get ~5 times the LIBPERF_COUNT_HW_CACHE_MISSES
counter after the Popen()
call:
from subprocess import Popen, DEVNULL
from perflib import PerfCounter
import numpy as np
arr = []
p = PerfCounter('LIBPERF_COUNT_HW_CACHE_MISSES')
for i in range(100):
ti = []
p.reset()
p.start()
ti.extend(p.getval() for _ in range(7))
Popen(['echo'], stdout=DEVNULL)
ti.extend(p.getval() for _ in range(7))
p.stop()
arr.append(ti)
np.diff(np.array(arr), axis=1).mean(axis=0).astype(int).tolist()
gives me:
2605, 2185, 2127, 2099, 2407, 2120,
5481210,
16499, 10694, 10398, 10301, 10206, 10166
(lines broken in non-standard places to indicate code flow)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With