Why are Python operations 30× slower after calling time.sleep or subprocess.Popen?

Tags:

Consider the following loop:

for i in range(20):
    if i == 10:
        subprocess.Popen(["echo"]) # command 1
    t_start = time.time()
    1+1 # command 2
    t_stop = time.time()
    print(t_stop - t_start)

“command 2” command takes systematically longer to run when “command 1” is run before it. The following plot shows the execution time of 1+1 as a function of the loop index i, averaged over 100 runs.

Execution of 1+1 is 30 times slower when preceded by subprocess.Popen.

Execution time of 1+1 as a function of loop index

It gets even weirder. One may think that only the first command run after subprocess.Popen() is affected, but it is not the case. The following loop shows that all commands in the current loop iteration are affected. But the subsequent loops iterations seem to be mostly OK.

var = 0
for i in range(20):
    if i == 10:
      # command 1
      subprocess.Popen(['echo'])
    # command 2a
    t_start = time.time()
    1 + 1
    t_stop = time.time()
    print(t_stop - t_start)
    # command 2b
    t_start = time.time()
    print(1)
    t_stop = time.time()
    print(t_stop - t_start)
    # command 2c
    t_start = time.time()
    var += 1
    t_stop = time.time()
    print(t_stop - t_start)

Here’s a plot of the execution times for this loop, average over 100 runs:

Execution times of 1+1, print(1), and var += 1 as a function of loop index

More remarks:

We get the same effect when replacing subprocess.Popen() (“command 1”) with time.sleep(), or rawkit’s libraw C++ bindings initialization (libraw.bindings.LibRaw()). However, using other libraries with C++ bindings such as libraw.py, or OpenCV’s cv2.warpAffine() do not affect execution times. Opening files don’t either.
The effect is not caused by time.time(), because it is visible with timeit.timeit(), and even by measuring manually when print() result appear.
It also happens without a for-loop.
This happens even when a lot of different (possibly CPU- and memory-consuming) operations are performed between “command 1” (subprocess.Popen) and “command 2”.
With Numpy arrays, the slowdown appears to be proportional to the size of the array. With relatively big arrays (~ 60 M points), a simple arr += 1 operation can take up to 300 ms!

Question: What may cause this effect, and why does it affect only the current loop iteration?

I suspect that it could be related to context switching, but this doesn’t seem to explain why a whole loop iteration would affected. If context switching is indeed the cause, why do some commands trigger it while others don’t?

806

asked Nov 09 '18 14:11

Arcturus B

1 Answers

my guess would be that this is due to the Python code being evicted from various caches in the CPU/memory system

the perflib package can be used to extract more detailed CPU level stats about the state of the cache — i.e. the number of hits/misses.

I get ~5 times the LIBPERF_COUNT_HW_CACHE_MISSES counter after the Popen() call:

from subprocess import Popen, DEVNULL
from perflib import PerfCounter
import numpy as np

arr = []
p = PerfCounter('LIBPERF_COUNT_HW_CACHE_MISSES')                                                        

for i in range(100):
  ti = []
  p.reset()
  p.start()
  ti.extend(p.getval() for _ in range(7))
  Popen(['echo'], stdout=DEVNULL)
  ti.extend(p.getval() for _ in range(7))
  p.stop()
  arr.append(ti)


np.diff(np.array(arr), axis=1).mean(axis=0).astype(int).tolist()

gives me:

 2605,  2185,  2127,  2099,  2407,  2120,
5481210,
16499, 10694, 10398, 10301, 10206, 10166

(lines broken in non-standard places to indicate code flow)

130

answered Oct 23 '22 17:10

Sam Mason

Related questions
                            
                                django - inline - Search for existing record instead of adding a new one
                            
                                Models inside tests - Django 1.7 issue
                            
                                How to return str from MySQL using mysql.connector?
                            
                                Pandas DataFrame: complete spec for __getitem__()? [closed]
                            
                                Accessing JVM from python
                            
                                Why does this Jython loop fail after a single run?
                            
                                Headless Selenium + Xvfb + Chrome on OSX 10.11
                            
                                Disable special "class" attribute handling
                            
                                Sending JSON data over WebSocket from Matlab using Python Twisted and Autobahn
                            
                                Django REST Framework (ModelViewSet), 405 METHOD NOT ALLOWED
                            
                                JSON formatted logging with Flask and gunicorn
                            
                                completely self-contained virtual environment
                            
                                Nuitka error Cannot find ' ' in package ' ' as absolute import
                            
                                pandas get_level_values for multiple columns
                            
                                PySpark: StructField(..., ..., False) always returns `nullable=true` instead of `nullable=false`
                            
                                Python Jupyter: Shortcut to copy output of a cell
                            
                                How can we build and distribute python scripts in a windows environment
                            
                                How to keep tensorflow session open between predictions? Loading from SavedModel
                            
                                How to save a custom transformer in sklearn?
                            
                                Tensorflow CNN training images are all different sizes

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why are Python operations 30× slower after calling time.sleep or subprocess.Popen?

Tags:

performance

python

subprocess

Arcturus B

People also ask

1 Answers

Sam Mason

Recent Activity

Donate For Us