Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I choose between Pipe, Queue, Value, Array and Manager to communicate between processes?

Tags:

python

process

In python, it supplies a lot of ways to communicate between processes in module multiprocessing, Pipe, Queue, Value, Array and Manager. Which of them are better choices?

like image 874
Shady Xu Avatar asked Jun 06 '13 06:06

Shady Xu


People also ask

How do you communicate two processes in Python?

Every object has two methods – send() and recv(), to communicate between processes.

How do I share data between two processes in Python?

Passing Messages to Processes A simple way to communicate between process with multiprocessing is to use a Queue to pass messages back and forth. Any pickle-able object can pass through a Queue. This short example only passes a single message to a single worker, then the main process waits for the worker to finish.

Which is the method used to change the default way to create child processes in multiprocessing?

The possible start methods are 'fork', 'spawn' and 'forkserver'. On Windows only 'spawn' is available. On Unix 'fork' and 'spawn' are always supported, with 'fork' being the default.

How many processes should be running Python multiprocessing?

If we are using the context manager to create the process pool so that it is automatically shutdown, then you can configure the number of processes in the same manner. The number of workers must be less than or equal to 61 if Windows is your operating system.


1 Answers

Use Pipe and Queues if you want to implement message-passing. Use Value and Array if you want to implement shared memeory. Use Managers if you want to expose an object-oriented interface to multiple processes.

Pipe is good for 1-to-1 communication or for byte-level protocols:

  • Pipe can be either bidirectional (“duplex”) or unidirectional
  • Pipe is not concurrency safe: only one process can use the same end of the Pipe; it is good in 1-to-1 communications, otherwise it requires locks
  • Pipe may transmit pickable Python objects or raw bytes
  • The buffer size cannot be specified

Queue is similar to a unidirectional Pipe, but may work in many-to-many scenarios:

  • Queue is unidirectional: first in, first out
  • Queue is concurrency safe: multiple processes may use the same end of the Queue; so it is good when there are multiple producers or multiple consumers
  • Queue may transmit only pickable Python objects
  • The maximum size of the Queue may be specified
  • "Whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined."

Queue is implemented using a pipe and some locks/semaphores.

Value and Array:

  • provide synchronized access to shared data
  • work only with C data types (ctypes)

Managers:

  • may be used on various data types: they return proxy objects which expose the same methods as the underlying (shared) object
  • can handle remote access

Value and Array is a lightweight approach to shared memory. In my experience, the overhead of using a SyncManager and AutoProxy can be huge. If you can solve your problem using a Value or an Array, use them. SyncManager may be useful to expose an object-oriented interface to multiple processes, unless it is not called too frequently.

like image 179
sastanin Avatar answered Nov 15 '22 16:11

sastanin