I have some questions about using the torch.multiprocessing module. Let's say I have a torch.nn.Module called model and I call model.share_memory() on it.
What happens if two threads call forward(), i.e. model(input), at the same time? Is it safe, or should I use a Lock to make sure model is not accessed by multiple threads at once?
Similarly, what happens if two or more threads have an optimizer working on model.parameters() and they call optimizer.step() at the same time?
I ask these questions because I often see optimizer.step() being called on shared models without any lock mechanism (e.g. in RL implementations of A3C or ACER), and I wonder whether that is safe to do.
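Concretely, the pattern I'm asking about looks something like this (the model, loss, and hyperparameters here are placeholders I made up):

```python
import torch
import torch.multiprocessing as mp
import torch.nn as nn
import torch.optim as optim

def worker(model):
    # Each worker builds its own optimizer over the shared parameters.
    optimizer = optim.SGD(model.parameters(), lr=1e-2)
    for _ in range(100):
        inp = torch.randn(1, 10)
        loss = model(inp).sum()  # placeholder loss
        optimizer.zero_grad()
        loss.backward()          # gradients stay local to this process
        optimizer.step()         # updates the shared parameters, with no lock

if __name__ == "__main__":
    model = nn.Linear(10, 5)
    model.share_memory()         # move the parameters into shared memory
    workers = [mp.Process(target=worker, args=(model,)) for _ in range(2)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
```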
torch.multiprocessing is a drop-in replacement for Python's multiprocessing module. It supports the exact same operations, but extends it so that all tensors sent through a multiprocessing.Queue will have their data moved into shared memory and will only send a handle to another process.
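For example, here is a minimal sketch I wrote to illustrate that behavior (the function and variable names are my own): because the receiving process gets a handle to the same shared-memory storage, an in-place edit on its side is visible to the sender.

```python
import torch
import torch.multiprocessing as mp

def consumer(q):
    t = q.get()  # only a handle is received; the data lives in shared memory
    t += 1       # in-place update, visible to the producer

if __name__ == "__main__":
    q = mp.Queue()
    t = torch.zeros(3)
    p = mp.Process(target=consumer, args=(q,))
    p.start()
    q.put(t)     # the tensor's storage is moved into shared memory here
    p.join()
    print(t)     # tensor([1., 1., 1.])
```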
Python supports multiprocessing, i.e. running several processes simultaneously, so a program can split its work into many tasks that execute at the same time.
multiprocessing is a package that supports spawning processes using an API similar to the threading module. The multiprocessing package offers both local and remote concurrency, effectively side-stepping the Global Interpreter Lock by using subprocesses instead of threads.
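The basic API mirrors threading; this is essentially the introductory example from the Python docs:

```python
from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
```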
It doesn't have to be guarded by locks: the workers run in separate processes and update the shared parameters asynchronously, and A3C-style training deliberately tolerates such lock-free (Hogwild-style) updates. Quoting from the docs:
Using torch.multiprocessing, it is possible to train a model asynchronously, with parameters either shared all the time, or being periodically synchronized. In the first case, we recommend sending over the whole model object, while in the latter, we advise to only send the state_dict().
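So in the periodic-synchronization case, a worker would receive only the state_dict() and load it into its own local copy, roughly like this (a sketch under my own naming, not code from the docs):

```python
import torch.multiprocessing as mp
import torch.nn as nn

def worker(q):
    local_model = nn.Linear(10, 5)      # the worker's own copy of the model
    state = q.get()                     # receive only the state_dict
    local_model.load_state_dict(state)  # synchronize with the sender's weights

if __name__ == "__main__":
    model = nn.Linear(10, 5)
    q = mp.Queue()
    p = mp.Process(target=worker, args=(q,))
    p.start()
    q.put(model.state_dict())  # the trainer would resend this periodically
    p.join()
```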