I am working on this model:
class Model(torch.nn.Module):
    def __init__(self, sizes, config):
        super(Model, self).__init__()
        self.lstm = []
        for i in range(len(sizes) - 2):
            self.lstm.append(LSTM(sizes[i], sizes[i+1], num_layers=8))
        self.lstm.append(torch.nn.Linear(sizes[-2], sizes[-1]).cuda())
        self.lstm = torch.nn.ModuleList(self.lstm)
        self.config_mel = config.mel_features

    def forward(self, x):
        # convert to log-domain
        x = x.clip(min=1e-6).log10()
        for layer in self.lstm[:-1]:
            x, _ = layer(x)
            x = torch.relu(x)
        #x = torch_unpack_seq(x)[0]
        x = self.lstm[-1](x)
        mask = torch.sigmoid(x)
        return mask
and then:
model = Model(model_width, config)
model.cuda()
But I am getting this error:
  File "main.py", line 29, in <module>
    Model.train(args)
  File ".../src/model.py", line 57, in train
    model.cuda()
  File ".../.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 637, in cuda
    return self._apply(lambda t: t.cuda(device))
  File ".../.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in _apply
    module._apply(fn)
  File "/.../.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 530, in _apply
    module._apply(fn)
  File ".../.local/lib/python3.8/site-packages/torch/nn/modules/rnn.py", line 189, in _apply
    self.flatten_parameters()
  File ".../.local/lib/python3.8/site-packages/torch/nn/modules/rnn.py", line 175, in flatten_parameters
    torch._cudnn_rnn_flatten_weight(
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
I have no idea why this is happening. I am moving both the model and the inputs to CUDA, and I would understand the error if some modules were on the CPU and others on the GPU, but that is not the case here. I found a pip install solution here: Pytorch CUDA error: no kernel image is available for execution on the device on RTX 3090 with cuda 11.1
but I cannot use it, because I am working in a remote repo where I don't have access to pip install.
Is there a way I can solve this?
I checked the latest torch and torchvision versions built with CUDA support at the link below. Stable versions list: https://download.pytorch.org/whl/cu113/torch_stable.html
The versions below solved the error:
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 -f https://download.pytorch.org/whl/torch_stable.html
Reference: #49161
talonmies' comment really helped:
The PyTorch installation you are trying to use doesn't have built-in binary support for the GPU you are trying to use. You will have to find (or make yourself) a build which has built in support. There is no work around here because of the design and packaging of PyTorch
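To see whether this is what is happening on your machine, you can compare the GPU's compute capability with the architectures the installed PyTorch build was compiled for. Below is a minimal sketch, not an official check: `build_supports_device` and `check_current_device` are hypothetical helper names, and the matching rules are a simplification (an `sm_XY` binary is assumed to run on the same major version with an equal or higher minor version, and `compute_XY` PTX on any equal or newer device). It relies on `torch.cuda.get_device_capability()` and `torch.cuda.get_arch_list()`.

```python
import re


def build_supports_device(capability, arch_list):
    """Return True if any entry in arch_list can run on a GPU with this capability.

    Simplified rules (assumption): an sm_XY binary runs on devices with the
    same major version and an equal or higher minor version; compute_XY PTX
    can be JIT-compiled for any equal or newer device.
    """
    major, minor = capability
    for arch in arch_list:
        match = re.fullmatch(r"(sm|compute)_(\d{2,})", arch)
        if match is None:
            continue  # skip suffixed or unknown entries such as "sm_90a"
        kind, digits = match.groups()
        arch_major, arch_minor = int(digits[:-1]), int(digits[-1])
        if kind == "sm" and arch_major == major and arch_minor <= minor:
            return True
        if kind == "compute" and (arch_major, arch_minor) <= (major, minor):
            return True
    return False


def check_current_device():
    """Print what the installed PyTorch build was compiled for (needs torch)."""
    import torch  # imported here so the helper above works without torch

    print("torch:", torch.__version__, "built for CUDA:", torch.version.cuda)
    if not torch.cuda.is_available():
        print("CUDA is not available in this environment")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("device capability:", capability)
    print("build architectures:", arch_list)
    print("supported:", build_supports_device(capability, arch_list))
```

Calling `check_current_device()` on the remote machine prints both lists. If the device capability (8.6 for an RTX 3090, for example) is not covered by any architecture in the build, you get exactly the "no kernel image is available" error, and the only fix is a different PyTorch build.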
The torch version was not compatible with the CUDA version. I was able to examine the issue in detail with CUDA_LAUNCH_BLOCKING=1. I uninstalled the previous CUDA version and installed the one I actually needed, and now it's working.
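If you cannot set the variable in the shell, one way to turn on CUDA_LAUNCH_BLOCKING is from the top of the script itself. This is a small sketch and assumes nothing else has initialized CUDA yet; it does not fix the error, it only makes the traceback point at the call that actually failed.

```python
import os

# Must be set before CUDA is initialized; setting it before importing torch
# is the safest place.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import torch (and build the model) only after this point
```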