
Python: running subprocess in parallel [duplicate]

I have the following code that writes the md5sums to a logfile:

    for file in files_output:
        p = subprocess.Popen(['md5sum', file], stdout=logfile)
    p.wait()
  1. Will these run in parallel? i.e. if md5sum takes a long time for one of the files, will another one be started before waiting for a previous one to complete?

  2. If the answer to the above is yes, can I assume the order of the md5sums written to logfile may differ based upon how long md5sum takes for each file? (some files can be huge, some small)

imagineerThat asked May 08 '13 21:05




1 Answer

  1. Yes, these md5sum processes will be started in parallel.
  2. Yes, the order of the md5sum writes will be unpredictable. And generally, sharing a single resource, such as a file, between many processes this way is considered bad practice.

Also, since your p.wait() comes after the for loop, it will wait only for the last of the md5sum processes to finish, and the rest of them might still be running.
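To see the minimal change that addresses the waiting problem alone, here is a sketch (assuming `files_output` and an open `logfile` as in the question; the helper name `md5_all` is made up here) that keeps every Popen handle and waits on each:

```python
import subprocess

def md5_all(paths, logfile):
    # Start one md5sum per file; all of them run in parallel.
    procs = [subprocess.Popen(['md5sum', path], stdout=logfile)
             for path in paths]
    # Now block until *every* process has exited, not just the last one.
    for proc in procs:
        proc.wait()
```

The write order in logfile is still unpredictable, which the temp-file approach below fixes.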

But you can modify this code slightly to keep the benefits of parallel processing while getting predictable, synchronized output: collect each md5sum's output in a temporary file, then copy it all into one file once every process is done.

    import subprocess
    import tempfile

    processes = []
    for file in files_output:
        # os.tmpfile() was removed in Python 3; tempfile.TemporaryFile()
        # is the replacement (it is opened in binary mode, so logfile
        # should be opened in binary mode too)
        f = tempfile.TemporaryFile()
        p = subprocess.Popen(['md5sum', file], stdout=f)
        processes.append((p, f))

    for p, f in processes:
        p.wait()
        f.seek(0)
        logfile.write(f.read())
        f.close()
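On Python 3 the same pattern can be written more compactly with concurrent.futures, which also preserves input order for you. A sketch (the names `md5_line` and `parallel_md5` are made up here; assumes md5sum is on the PATH):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def md5_line(path):
    # subprocess.run blocks this worker thread, but the thread pool
    # keeps several md5sum child processes running at once.
    result = subprocess.run(['md5sum', path],
                            capture_output=True, text=True, check=True)
    return result.stdout

def parallel_md5(paths, max_workers=4):
    # pool.map yields results in input order, no matter which
    # md5sum process happened to finish first.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(md5_line, paths))
```

Then `logfile.writelines(parallel_md5(files_output))` writes one checksum line per file, in the original order.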
dkz answered Sep 23 '22 15:09