Running several system commands in parallel in Python

Tags: python

I'm writing a simple script that executes a system command on a sequence of files. To speed things up, I'd like to run them in parallel, but not all at once: I need to control the maximum number of simultaneously running commands. What would be the easiest way to approach this?

asked Feb 14 '11 by michal



2 Answers

If you are calling subprocesses anyway, I don't see the need to use a thread pool. A basic implementation using the subprocess module would be:

import subprocess
import os
import time

files = <list of file names>
command = "/bin/touch"
processes = set()
max_processes = 5

for name in files:
    processes.add(subprocess.Popen([command, name]))
    if len(processes) >= max_processes:
        # os.wait() blocks until any child exits; then prune
        # the processes that have finished from the set
        os.wait()
        processes.difference_update([
            p for p in processes if p.poll() is not None])

On Windows, os.wait() is not available (nor is any other method of waiting for an arbitrary child process to terminate). You can work around this by polling at regular intervals:

for name in files:
    processes.add(subprocess.Popen([command, name]))
    while len(processes) >= max_processes:
        # Poll periodically instead of blocking in os.wait()
        time.sleep(.1)
        processes.difference_update([
            p for p in processes if p.poll() is not None])

The appropriate sleep interval depends on the expected execution time of the subprocesses.
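
If you do want a pool abstraction anyway, the same bounded parallelism is available from Python 3's standard library. This is not part of the original answer, just a minimal sketch assuming Python 3.2+; the file names are hypothetical and the command is the same /bin/touch as above:

import subprocess
from concurrent.futures import ThreadPoolExecutor

files = ["a.txt", "b.txt", "c.txt"]  # hypothetical file names
command = "/bin/touch"

def run_command(name):
    # Each worker blocks until its command finishes, so at most
    # max_workers commands ever run at the same time.
    return subprocess.call([command, name])

with ThreadPoolExecutor(max_workers=5) as pool:
    exit_codes = list(pool.map(run_command, files))

Threads work well here because the Python code does almost no work itself; each thread just waits on an external process.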

answered by Sven Marnach


The answer from Sven Marnach is almost right, but there is a problem: once the last files have been submitted, the for loop ends while up to max_processes child processes may still be running. If the main program then exits, it can take those child processes down with it. For me, this happened with the screen command.

On Linux, the code will look like this (it only works on Python 2.7):

import subprocess
import os

files = <list of file names>
command = "/bin/touch"
processes = set()
max_processes = 5

for name in files:
    processes.add(subprocess.Popen([command, name]))
    if len(processes) >= max_processes:
        os.wait()
        processes.difference_update(
            [p for p in processes if p.poll() is not None])

# Wait for any child processes that are still running
for p in processes:
    if p.poll() is None:
        p.wait()
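
For completeness, a Windows-friendly version can be sketched by combining the two snippets: the polling throttle from the first answer replaces os.wait(), and the final drain stays, since Popen.wait() works on all platforms. This combined version appears in neither answer; the file list and command are hypothetical:

import subprocess
import time

files = ["a.txt", "b.txt", "c.txt"]  # hypothetical file names
command = "touch"                    # hypothetical command
processes = set()
max_processes = 5

for name in files:
    processes.add(subprocess.Popen([command, name]))
    while len(processes) >= max_processes:
        # os.wait() is unavailable on Windows, so poll instead
        time.sleep(.1)
        processes.difference_update(
            [p for p in processes if p.poll() is not None])

# Popen.wait() works on Windows as well, unlike os.wait();
# waiting on an already-finished process returns immediately
for p in processes:
    p.wait()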
answered by Thuener