I have a simple neural network model, and I apply either cuda() or DataParallel() to the model, like so:

model = torch.nn.DataParallel(model).cuda()

or:

model = model.cuda()

When I don't use DataParallel and simply move my model to cuda(), I also have to explicitly move the batch inputs to cuda() before passing them to the model; otherwise it returns the following error:

torch.index_select received an invalid combination of arguments - got (torch.cuda.FloatTensor, int, torch.LongTensor)

With DataParallel, however, the code works fine. Everything else is the same. Why does this happen? Why, when I use DataParallel, don't I need to move the batch inputs to cuda() explicitly?
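Here is a minimal, self-contained version of what I am doing (the embedding layer and batch shape are toy choices just for illustration; the embedding lookup is what produces the index_select call in the error):

import torch
import torch.nn as nn

model = nn.Embedding(100, 16)
inputs = torch.randint(0, 100, (8, 4))   # CPU LongTensor batch

# Variant 1: DataParallel -- CPU inputs are accepted; DataParallel
# scatters them onto the GPUs itself before the forward pass.
dp_model = nn.DataParallel(model).cuda()
out = dp_model(inputs)                   # works with CPU inputs

# Variant 2: plain cuda() -- inputs must be moved explicitly, or the
# index_select sees a cuda weight and a CPU index tensor and fails.
cuda_model = model.cuda()
out = cuda_model(inputs.cuda())          # works
# out = cuda_model(inputs)               # raises the error quoted above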
Comparison between DataParallel and DistributedDataParallel

First, DataParallel is single-process and multi-threaded, and it only works on a single machine, while DistributedDataParallel is multi-process and works for both single- and multi-machine training. DataParallel parallelizes the application of the given module by splitting the input across the specified devices, chunking along the batch dimension (other objects are copied once per device); the per-GPU results are then combined back in one version of the model. DistributedDataParallel, by contrast, uses multiprocessing instead of threading: the model is replicated across the GPUs, each replica controlled by its own process, and propagation through the model runs separately in each process.
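For illustration, a minimal sketch of the DDP pattern described above (the toy model, master address, and port are placeholder assumptions, not from the original discussion):

import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank, world_size):
    # One process per GPU: join the process group, pin this process
    # to its device, and wrap the model replica in DDP.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = nn.Linear(10, 2).cuda(rank)
    ddp_model = DDP(model, device_ids=[rank])

    # Unlike DataParallel, each process feeds its own already-cuda batch.
    inputs = torch.randn(8, 10).cuda(rank)
    out = ddp_model(inputs)
    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    mp.spawn(worker, args=(world_size,), nprocs=world_size)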
Because DataParallel allows CPU inputs: its first step is to transfer the inputs to the appropriate GPUs.
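A simplified sketch of that first step (not the actual library source; the batch shape is an assumption):

import torch
import torch.nn as nn

# Roughly what DataParallel.forward does before running the model:
# scatter the batch across the GPUs, moving CPU chunks onto each device.
inputs = torch.randn(8, 10)                        # CPU batch is fine
device_ids = list(range(torch.cuda.device_count()))
chunks = nn.parallel.scatter(inputs, device_ids)   # chunks land on the GPUs
print([c.device for c in chunks])                  # e.g. [cuda:0, cuda:1, ...]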
Info source: https://discuss.pytorch.org/t/cuda-vs-dataparallel-why-the-difference/4062/3