I have a simple task like that:
def worker(queue): while True: try: _ = queue.get_nowait() except Queue.Empty: break if __name__ == '__main__': manager = multiprocessing.Manager() # queue = multiprocessing.Queue() queue = manager.Queue() for i in range(5): queue.put(i) processes = [] for i in range(2): proc = multiprocessing.Process(target=worker, args=(queue,)) processes.append(proc) proc.start() for proc in processes: proc.join()
It seems that multiprocessing.Queue can do all work that i needed, but on the other hand I see many examples of manager().Queue() and can't understand what I really need. Looks like Manager().Queue() use some sort of proxy objects, but I doesn't understand those purpose, because multiprocessing.Queue() do the same work without any proxy objects.
So, my questions is:
1) What really difference between multiprocessing.Queue and object returned by multiprocessing.manager().Queue()?
2) What do I need to use?
Python Multiprocessing modules provides Queue class that is exactly a First-In-First-Out data structure. They can store any pickle Python object (though simple ones are best) and are extremely useful for sharing data between processes.
multiprocessing is a package that supports spawning processes using an API similar to the threading module.
Yes, it is. From https://docs.python.org/3/library/multiprocessing.html#exchanging-objects-between-processes: Queues are thread and process safe.
In other words, Multiprocess queue is pretty slow putting and getting individual data, then QuickQueue wrap several data in one list, this list is one single data that is enqueue in the queue than is more quickly than put one individual data.
Though my understanding is limited about this subject, from what I did I can tell there is one main difference between multiprocessing.Queue() and multiprocessing.Manager().Queue():
Note When an object is put on a queue, the object is pickled and a background thread later flushes the pickled data to an underlying pipe. This has some consequences which are a little surprising, but should not cause any practical difficulties – if they really bother you then you can instead use a queue created with a manager. After putting an object on an empty queue there may be an infinitesimal delay before the queue’s empty() method returns False and get_nowait() can return without raising Queue.Empty. If multiple processes are enqueuing objects, it is possible for the objects to be received at the other end out-of-order. However, objects enqueued by the same process will always be in the expected order with respect to each other.
Warning As mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe. This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children. Note that a queue created using a manager does not have this issue.
There is a workaround to use multiprocessing.Queue() with Pool by setting the queue as a global variable and setting it for all processes at initialization :
queue = multiprocessing.Queue() def initialize_shared(q): global queue queue=q pool= Pool(nb_process,initializer=initialize_shared, initargs(queue,))
will create pool processes with correctly shared queues but we can argue that the multiprocessing.Queue() objects were not created for this use.
On the other hand the manager.Queue() can be shared between pool subprocesses by passing it as normal argument of a function.
In my opinion, using multiprocessing.Manager().Queue() is fine in every case and less troublesome. There might be some drawbacks using a manager but I'm not aware of it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With