
multiprocessing Programming guidelines unclear

I am trying to understand the following guideline:

Better to inherit than pickle/unpickle

When using the spawn or forkserver start methods many types from multiprocessing need to be picklable so that child processes can use them. However, one should generally avoid sending shared objects to other processes using pipes or queues. Instead you should arrange the program so that a process which needs access to a shared resource created elsewhere can inherit it from an ancestor process.

  • What does it mean to "arrange the program"?
  • How can I share resources by inheriting?

I'm running Windows, so new processes are spawned. Does that mean only forked processes can inherit?

Rui Botelho asked Nov 08 '14 15:11



1 Answer

1. What does it mean to "arrange the program"?

It means structuring your program so that each process is as self-contained as possible and does not depend on resources shared with other processes. Sharing files leads to locking issues; sharing memory leads either to the same locking issues or to data corruption when multiple processes modify the data at the same time.

Here's an example of what would be a bad idea:

# Bad: every child process receives the whole queue and pops from it
while some_queue_is_not_empty():
    run_external_process(some_queue)

def external_process(queue):
    item = queue.pop()
    # do processing here

Versus:

# Better: only the parent touches the queue; each child gets a single item
while some_queue_is_not_empty():
    item = some_queue.pop()
    run_external_process(item)

def external_process(item):
    # do processing here
    pass

This way you can avoid locking the queue and/or corruption issues due to multiple processes getting the same item.
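As a runnable sketch of the second pattern, assuming the work items are plain picklable values and using a `collections.deque` as a stand-in for the queue (the names `describe` and `run_all` are made up for illustration):

```python
from collections import deque
from multiprocessing import Process


def describe(item):
    # Hypothetical stand-in for the real processing step.
    return "processed %r" % (item,)


def external_process(item):
    # Runs in the child process; it only ever sees the one item it was given.
    print(describe(item))


def run_all(items):
    pending = deque(items)
    workers = []
    while pending:
        item = pending.popleft()  # only the parent ever touches the queue
        worker = Process(target=external_process, args=(item,))
        worker.start()
        workers.append(worker)
    for worker in workers:
        worker.join()
    # An exit code of 0 means the child finished without an error.
    return [worker.exitcode for worker in workers]


if __name__ == "__main__":
    print(run_all(["a", "b", "c"]))
```

Starting one process per item is only for illustration. With many items, a `multiprocessing.Pool` is probably the more practical choice, but the idea is the same: the parent owns the queue and each child receives just its item.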

2. How can I share resources by inheriting?

On Windows, you can't, at least not implicitly. On Linux, processes are forked, so the child can use file descriptors that the parent opened. On Windows the child is a brand-new process, so it has nothing from its parent except what was explicitly given to it.

Example copied from: http://rhodesmill.org/brandon/2010/python-multiprocessing-linux-windows/

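from multiprocessing import Process
f = None

def child():
    # Sees whatever value the global f has in the child process
    print(f)

if __name__ == '__main__':
    # Only the parent runs this block, so only the parent assigns f here
    f = open('mp.py', 'r')
    p = Process(target=child)
    p.start()
    p.join()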

On Linux you will get something like:

$ python mp.py
<open file 'mp.py', mode 'r' at 0xb7734ac8>

On Windows you will get:

C:\Users\brandon\dev>python mp.py
None
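The "what was given" part is the way out on Windows. As far as I can tell, this is also what the guideline means by inheriting: create the shared resource in the parent and pass it as an argument to the `Process` constructor, instead of sending it through a pipe or queue later. A minimal sketch, assuming the resource is something multiprocessing knows how to hand to a child, such as a `multiprocessing.Queue` (a plain open file object cannot be pickled, so under spawn you would pass the file name and reopen it in the child):

```python
from multiprocessing import Process, Queue


def child(queue):
    # The queue arrives as an explicit argument, so this does not rely on
    # globals copied by fork and should also work with the spawn method.
    queue.put("hello from child")


if __name__ == "__main__":
    queue = Queue()  # shared resource created in the parent
    p = Process(target=child, args=(queue,))
    p.start()
    print(queue.get())  # read the message before joining the child
    p.join()
```

This prints `hello from child` regardless of the start method, because nothing depends on the child seeing the parent's module-level state.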
Wolph answered Oct 12 '22 07:10
