I'm doing something very simple using multiprocessing:
data = {'a': 1}
queue.put(data, True)
data.clear()
When I use the queue in another process (via the get() method), I get an empty dictionary. If I remove data.clear(), I get the keys as expected. Is there any way to wait for put() to have finished the serialization?
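A minimal self-contained script that reproduces this (the reader function is just for illustration):

import multiprocessing as mp

def reader(q):
    # Frequently prints {}: the parent clears the dict before the
    # queue's background thread has pickled it.
    print(q.get())

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=reader, args=(q,))
    p.start()
    data = {'a': 1}
    q.put(data, True)  # returns before serialization completes
    data.clear()
    p.join()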
Actually, this is considered a feature, not a problem: put() returns immediately, and the object is serialized later by a background feeder thread, so your process can keep running and what is known as "queue contention" is avoided.
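A small experiment that makes the deferred serialization visible. The Traced class is a hypothetical helper; __reduce__ is the standard pickle hook, which the feeder thread invokes at the moment it actually serializes the object, so the timestamps typically show put() returning first:

import multiprocessing as mp
import time

class Traced:
    def __reduce__(self):
        # Called by the queue's feeder thread during pickling.
        print('pickled at %.6f' % time.monotonic())
        return (dict, ({'a': 1},))

if __name__ == '__main__':
    q = mp.Queue()
    q.put(Traced())
    print('put() returned at %.6f' % time.monotonic())
    print(q.get())  # {'a': 1}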
The two options I suggest are:
1. Are you absolutely sure you need mutable dictionaries in the first place? Instead of making defensive copies of your data, which you rightly seem to dislike, why not just create a new dictionary instead of calling dict.clear(), and let the garbage collector worry about the old ones? (A sketch follows this list.)
2. Pickle the data yourself; that is, a_queue.put(pickle.dumps(data)) on one side and pickle.loads(a_queue.get()) on the other. Now, if you call data.clear() just after a put, the data has already been serialized "by you". (See the second sketch below.)
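A minimal sketch of the first option, rebinding the name instead of mutating the object:

import multiprocessing as mp

q = mp.Queue()
data = {'a': 1}
q.put(data, True)
# Rebind instead of clearing: the dict handed to the queue is left
# untouched, and the old object is garbage-collected once the
# feeder thread is done with it.
data = {}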
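And a sketch of the second option; pickle.dumps() runs synchronously in the caller, so the bytes are a frozen snapshot of the dict by the time put() returns:

import pickle
import multiprocessing as mp

q = mp.Queue()
data = {'a': 1}
q.put(pickle.dumps(data), True)  # serialized here, in this thread
data.clear()  # safe: only the live dict is cleared, not the bytes

# Consumer side:
# restored = pickle.loads(q.get())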
From a parallel programming point of view, the first approach (treating your data as if it were immutable) is the more viable and cleaner thing to do in the long term, but I am not sure if or why you must clear your dictionaries.
The best way is probably to make a copy of data before sending it. Try:
data = {'a': 1}
dc = data.copy()
queue.put(dc)
data.clear()
Basically, you can't count on the send finishing before the dictionary is cleared, so you shouldn't try. dc will be garbage-collected when it goes out of scope or when that line is executed again.
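For completeness, a runnable version of this approach (the reader process here is illustrative):

import multiprocessing as mp

def reader(q):
    print(q.get())  # prints {'a': 1}

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=reader, args=(q,))
    p.start()
    data = {'a': 1}
    q.put(data.copy())  # the snapshot is queued, not the live dict
    data.clear()        # no longer affects what the reader receives
    p.join()

One caveat worth noting: dict.copy() is shallow, so if the values are themselves mutable objects and you mutate them after put(), the same race applies; copy.deepcopy(data) would be needed in that case.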