I am programming with PyTorch multiprocessing. I want all subprocesses to be able to read and write the same list of tensors (without resizing them). For example, the variable could be
m = [torch.randn(3), torch.randn(5)]
Because the tensors have different sizes, I cannot pack them into a single tensor.
A Python list has no share_memory_() method, and multiprocessing.Manager cannot handle a list of tensors. How can I share the variable m among multiple subprocesses?
torch.multiprocessing is a drop-in replacement for Python's multiprocessing module. It supports the exact same operations but extends them, so that all tensors sent through a multiprocessing.Queue will have their data moved into shared memory, and only a handle will be sent to the other process.
item() → number: returns the value of this tensor as a standard Python number. This only works for tensors with one element. For other cases, see tolist().
share_memory_() will move the tensor data to shared memory on the host so that it can be shared between multiple processes. It is a no-op for CUDA tensors as described in the docs.
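A quick sketch of that call: `share_memory_()` modifies the tensor in place, and `is_shared()` reports whether its storage is in shared memory.

```python
import torch

t = torch.randn(3)
print(t.is_shared())  # False: an ordinary CPU tensor
t.share_memory_()     # moves the storage into shared memory, in place
print(t.is_shared())  # True
```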
A torch.Tensor is a multi-dimensional matrix containing elements of a single data type.
I found the solution myself. It is pretty straightforward: just call share_memory_() on each list element. The list itself is not in shared memory, but the list elements are.
Demo code
import torch.multiprocessing as mp
import torch

def foo(worker, tl):
    tl[worker] += (worker + 1) * 1000

if __name__ == '__main__':
    tl = [torch.randn(2), torch.randn(3)]
    for t in tl:
        t.share_memory_()
    print("before mp: tl=")
    print(tl)
    p0 = mp.Process(target=foo, args=(0, tl))
    p1 = mp.Process(target=foo, args=(1, tl))
    p0.start()
    p1.start()
    p0.join()
    p1.join()
    print("after mp: tl=")
    print(tl)
Output
before mp: tl=
[
1.5999
2.2733
[torch.FloatTensor of size 2]
,
0.0586
0.6377
-0.9631
[torch.FloatTensor of size 3]
]
after mp: tl=
[
1001.5999
1002.2733
[torch.FloatTensor of size 2]
,
2000.0586
2000.6377
1999.0370
[torch.FloatTensor of size 3]
]
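One caveat worth noting, since only the elements (not the list) are shared: in-place updates to a shared tensor, such as `+=` above, are visible in the parent, but rebinding a list slot to a freshly built tensor only changes the child's copy of the list. A sketch (the helper names `inplace` and `rebind` are my own):

```python
import torch
import torch.multiprocessing as mp

def inplace(tl):
    tl[0] += 1           # in-place op on shared storage: visible to the parent

def rebind(tl):
    tl[1] = tl[1] + 1    # builds a new, non-shared tensor and rebinds the
                         # child's copy of the list: invisible to the parent

if __name__ == '__main__':
    tl = [torch.zeros(2), torch.zeros(2)]
    for t in tl:
        t.share_memory_()
    for target in (inplace, rebind):
        p = mp.Process(target=target, args=(tl,))
        p.start()
        p.join()
    print(tl[0])  # tensor([1., 1.]) -- updated
    print(tl[1])  # tensor([0., 0.]) -- unchanged
```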
The original answer given by @rozyang does not work with GPUs. It raises an error like:
RuntimeError: CUDA error: initialization error. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
To fix it, add mp.set_start_method('spawn', force=True) to the code. The following is a snippet:
import torch.multiprocessing as mp
import torch

def foo(worker, tl):
    tl[worker] += (worker + 1) * 1000

if __name__ == '__main__':
    mp.set_start_method('spawn', force=True)
    tl = [torch.randn(2, device='cuda:0'), torch.randn(3, device='cuda:0')]
    for t in tl:
        t.share_memory_()
    print("before mp: tl=")
    print(tl)
    p0 = mp.Process(target=foo, args=(0, tl))
    p1 = mp.Process(target=foo, args=(1, tl))
    p0.start()
    p1.start()
    p0.join()
    p1.join()
    print("after mp: tl=")
    print(tl)