
python multiprocessing writing to shared file

When I write to an open file that I have shared by passing it to a worker function run under multiprocessing, the file's contents are not written properly. Instead, '^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^' is written to the file.

Why would this happen? Can you not have many multiprocessing units writing to the same file? Do you need to use a Lock? A Queue? Am I not using Multiprocessing correctly or effectively?

I feel like some example code might help, but please treat it only as a reference for how I open a file and pass the open file via multiprocessing to another function that writes to that file.

Multiprocessing file:

import multiprocessing as mp

class PrepWorker():
    def worker(self, open_file):
        for i in range(1,1000000):
            data = GetDataAboutI() # This function would be in a separate file
            open_file.write(data)
            open_file.flush()
        return

if __name__ == '__main__':
    open_file = open('/data/test.csv', 'w+')
    jobs = []  # keep the started processes so they can be joined below
    for i in range(4):
        p = mp.Process(target=PrepWorker().worker, args=(open_file,))
        jobs.append(p)
        p.start()

    for j in jobs:
        j.join()
        print('{0}.exitcode = {1}'.format(j.name, j.exitcode))
    open_file.close()
ccdpowell asked Dec 29 '15 07:12



1 Answer

Why would this happen?

Several processes may try to call

open_file.write(data)
open_file.flush()

at the same time. What behavior would you expect if something like

  • a.write
  • b.write
  • a.flush
  • c.write
  • b.flush

happens?
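If the workers really must write to the file themselves, one way to rule out such an interleaving is to make each write-and-flush pair a single unit guarded by a lock. This is only a minimal Python 3 sketch under a few assumptions: every worker opens its own handle in append mode instead of inheriting one, get_data_about() is a hypothetical stand-in for GetDataAboutI(), and run_locked() and its parameters are names made up for illustration.

```python
import multiprocessing as mp


def get_data_about(i):
    # Hypothetical stand-in for the question's GetDataAboutI().
    return "{0},{1}\n".format(i, i * i)


def worker(lock, path, start, stop):
    # Each worker opens its own handle in append mode.
    with open(path, "a") as out:
        for i in range(start, stop):
            data = get_data_about(i)
            with lock:
                # No other process can write between these two calls.
                out.write(data)
                out.flush()  # empty the buffer before releasing the lock


def run_locked(path, n_workers=4, rows_per_worker=1000, start_method=None):
    ctx = mp.get_context(start_method)  # None selects the platform default
    lock = ctx.Lock()
    open(path, "w").close()  # create or truncate the file once, up front
    procs = [
        ctx.Process(
            target=worker,
            args=(lock, path, n * rows_per_worker, (n + 1) * rows_per_worker),
        )
        for n in range(n_workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Rows from different workers still land in whatever order the lock is acquired, so the file is complete but not sorted. Call run_locked() from under an `if __name__ == '__main__':` guard, as in the question.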

Can you not have many multiprocessing units writing to the same file? Do you need to use a Lock? A Queue?

"Python multiprocessing safely writing to a file" recommends having one queue, which is then read by a single process that writes to the file. So do "Writing to a file with multiprocessing" and "Processing single file from multiple processes in python".
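To make the queue approach concrete, here is a minimal Python 3 sketch of it, not a drop-in replacement for the code in the question. The workers only put finished rows on a queue, and a single writer process is the only one that touches the file. get_data_about() is a hypothetical stand-in for GetDataAboutI(), and run(), SENTINEL and the parameter names are assumptions made for illustration.

```python
import multiprocessing as mp

SENTINEL = None  # tells the writer that all workers are done


def get_data_about(i):
    # Hypothetical stand-in for the question's GetDataAboutI().
    return "{0},{1}\n".format(i, i * i)


def worker(queue, start, stop):
    # Workers never touch the file; they only enqueue finished rows.
    for i in range(start, stop):
        queue.put(get_data_about(i))


def writer(queue, path):
    # The only process that opens and writes the file.
    with open(path, "w") as out:
        # iter(callable, sentinel) keeps calling queue.get() until it
        # returns SENTINEL.
        for row in iter(queue.get, SENTINEL):
            out.write(row)


def run(path, n_workers=4, rows_per_worker=1000, start_method=None):
    ctx = mp.get_context(start_method)  # None selects the platform default
    queue = ctx.Queue()
    writer_proc = ctx.Process(target=writer, args=(queue, path))
    writer_proc.start()
    workers = [
        ctx.Process(
            target=worker,
            args=(queue, n * rows_per_worker, (n + 1) * rows_per_worker),
        )
        for n in range(n_workers)
    ]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    queue.put(SENTINEL)  # sent only after every worker has exited
    writer_proc.join()
```

Call run('/data/test.csv') from under an `if __name__ == '__main__':` guard, as in the question. Because only one process ever writes, no lock is needed, and the writer is free to buffer its output instead of flushing after every row.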

serv-inc answered Sep 27 '22 20:09
