 

Dump intermediate results of multiprocessing job to filesystem and continue with processing later on

I have a job that uses the multiprocessing package and calls a function via

resultList = pool.map(myFunction, myListOfInputParameters).

Each entry in the list of input parameters is independent of the others.

This job will run for a couple of hours. For safety reasons, I would like to store the intermediate results at regular time intervals, e.g. once an hour.

How can I do this, and how can I continue processing from the last available backup if the job is aborted and restarted?
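For reference, here is a minimal runnable version of the setup described above; myFunction and the input list are just stand-ins for the real work:

from multiprocessing import Pool

def myFunction(param):
    # Stand-in for the real, independent per-item computation
    return param * 2

if __name__ == '__main__':
    myListOfInputParameters = list(range(10))
    with Pool() as pool:
        resultList = pool.map(myFunction, myListOfInputParameters)
    print(resultList)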

asked Jan 01 '19 by user7468395

People also ask

How do you stop a multiprocessing process?

A process can be killed by calling its terminate() method. The call only terminates the target process, not its child processes. The method is called on the multiprocessing.Process instance you want to stop.
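A minimal sketch of terminating a process:

from multiprocessing import Process
import time

def worker():
    # Loops forever until terminated from the parent
    while True:
        time.sleep(1)

if __name__ == '__main__':
    p = Process(target=worker)
    p.start()
    time.sleep(2)
    p.terminate()  # kills only this process, not its children
    p.join()
    print(p.exitcode)  # negative signal number, e.g. -15 for SIGTERM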

What is Chunksize in multiprocessing?

“chunksize” is an argument to multiprocessing pool methods such as Pool.map() that controls how many tasks are grouped together and sent to each worker process at a time when issuing many tasks.
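For example, a sketch of passing chunksize to Pool.map():

from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    with Pool(4) as pool:
        # Ship the 1000 inputs to the workers in batches of 50
        # instead of one at a time, reducing IPC overhead
        results = pool.map(square, range(1000), chunksize=50)
    print(results[:5])  # [0, 1, 4, 9, 16]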

How does multiprocessing process work?

multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads.
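A minimal sketch of that threading-like API:

from multiprocessing import Process

def greet(name):
    print("hello from", name)

if __name__ == '__main__':
    # Same start()/join() interface as threading.Thread, but
    # greet() runs in a separate process with its own interpreter
    p = Process(target=greet, args=("worker",))
    p.start()
    p.join()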

What is multiprocessing dummy?

The multiprocessing.dummy module provides a wrapper around the multiprocessing module, implemented using thread-based concurrency. It is a drop-in replacement for multiprocessing, allowing a program that uses the multiprocessing API to switch to threads with a single change to its import statements.
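For example, a thread-based Pool needs only a changed import:

# from multiprocessing import Pool
from multiprocessing.dummy import Pool  # thread-based Pool, same API

def double(x):
    return x * 2

if __name__ == '__main__':
    with Pool(4) as pool:
        print(pool.map(double, range(8)))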


1 Answer

Perhaps use pickle. Read more here:

https://docs.python.org/3/library/pickle.html

Based on aws_apprentice's comment I created a full multiprocessing example in case you weren't sure how to use intermediate results. The first time this is run it will print None for each process, since no intermediate results exist yet. Run it again to simulate restarting from the saved state.

from multiprocessing import Process
import pickle

def proc(name):
    data = None

    # Load intermediate results from a previous run, if any exist
    try:
        with open(name + '.pkl', 'rb') as f:
            data = pickle.load(f)
    except FileNotFoundError:
        pass

    # Do something with the previous state
    print(data)
    data = "intermediate result for " + name

    # Periodically save your intermediate results
    with open(name + '.pkl', 'wb') as f:
        pickle.dump(data, f, -1)

if __name__ == '__main__':
    processes = []
    for x in range(5):
        p = Process(target=proc, args=("proc" + str(x),))
        p.daemon = True
        p.start()
        processes.append(p)

    for process in processes:
        process.join()

You can also use json if it makes sense to store the intermediate results in a human-readable format, or sqlite if you need to push the data into database rows.
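For instance, a minimal sketch of the same checkpoint pattern using json; the helper names here are illustrative, not part of the answer:

import json

def save_checkpoint(path, results):
    # Human-readable checkpoint; assumes results are JSON-serializable
    with open(path, 'w') as f:
        json.dump(results, f)

def load_checkpoint(path):
    # Returns an empty dict when no checkpoint exists yet
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}

if __name__ == '__main__':
    done = load_checkpoint('results.json')
    done['proc0'] = 'intermediate result for proc0'
    save_checkpoint('results.json', done)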

answered Oct 21 '22 by MarkReedZ