RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle)` with GPU only

Question

I'm working on a CNN for one-dimensional signals. It works fine on the CPU, but when I train the model on the GPU, a CUDA error occurs. I first got RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle), so I set os.environ['CUDA_LAUNCH_BLOCKING'] = "1". After that, a cublasSgemm error occurred instead of the cublasCreate error. Although the NVIDIA documentation suggests a possible hardware problem, I can train other CNNs on images without any error. Below is my code for loading the data and feeding it to the model during training.

    idx = np.arange(len(dataset))  # dataset & label shuffle in once
    np.random.shuffle(idx)

    dataset = dataset[idx]
    sdnn = np.array(sdnn)[idx.astype(int)]        

    train_data, val_data = dataset[:int(0.8 * len(dataset))], dataset[int(0.8 * len(dataset)):]
    train_label, val_label = sdnn[:int(0.8 * len(sdnn))], sdnn[int(0.8 * len(sdnn)):]
    train_set = DataLoader(dataset=train_data, batch_size=opt.batch_size, num_workers=opt.workers)

    for i, data in enumerate(train_set, 0):  # data.shape = [batch_size, 3000(len(signal)), 1(channel)] tensor

        x = data.transpose(1, 2)
        label = torch.Tensor(train_label[i * opt.batch_size:i * opt.batch_size + opt.batch_size])
        x = x.to(device, non_blocking=True)
        label = label.to(device, non_blocking=True) # [batch size]
        label = label.view([len(label), 1])
        optim.zero_grad()

        # Feature of signal extract
        y_predict = model(x) # [batch size, fc3 output] # Error occurred HERE
        loss = mse(y_predict, label)

Below is the error message from this code.

  File "C:/Users/Me/Desktop/Me/Study/Project/Analysis/Regression/main.py", line 217, in Processing
    y_predict = model(x) # [batch size, fc3 output]
  File "C:\Anaconda\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\ME\Desktop\ME\Study\Project\Analysis\Regression\cnn.py", line 104, in forward
    x = self.fc1(x)
  File "C:\Anaconda\envs\torch\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Anaconda\envs\torch\lib\site-packages\torch\nn\modules\linear.py", line 91, in forward
    return F.linear(input, self.weight, self.bias)
  File "C:\Anaconda\envs\torch\lib\site-packages\torch\nn\functional.py", line 1674, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

I've tried to solve this error for weeks but can't find the solution. If you can see anything wrong here, please let me know.

Loich · Accepted Answer

Note that this error can also be caused by a mismatch between the dimensions of your input tensor and those of your nn.Linear module (e.g. input.shape = (a, b) and nn.Linear(c, c, bias=False) with c not equal to b).
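To make the shape mismatch concrete, here is a minimal sketch of how one might verify the size expected by the first fully connected layer before running on the GPU. The layer stack is hypothetical (the model in the question is not shown), and `conv1d_out_len` and `check_linear_input` are illustrative helper names, not PyTorch functions. The length formula is the one given in the PyTorch documentation for Conv1d and MaxPool1d.

```python
def conv1d_out_len(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1-D convolution or pooling layer (PyTorch formula)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def check_linear_input(input_shape, in_features):
    """Raise a readable error if the last input dimension differs from in_features."""
    if input_shape[-1] != in_features:
        raise ValueError(
            f"last input dimension is {input_shape[-1]}, "
            f"but the linear layer expects {in_features}"
        )


# Hypothetical stack, NOT the asker's model: Conv1d(kernel_size=5) followed by
# MaxPool1d(kernel_size=2, stride=2) applied to a signal of length 3000.
length = conv1d_out_len(3000, kernel_size=5)               # 2996
length = conv1d_out_len(length, kernel_size=2, stride=2)   # 1498
out_channels = 16                                          # hypothetical value
flat_features = out_channels * length                      # 23968

# Passes silently when the flattened size matches the layer's in_features.
check_linear_input((32, flat_features), in_features=flat_features)
```

In a real model the same check could be called as check_linear_input(x.shape, self.fc1.in_features) just before self.fc1(x), since a tensor's shape behaves like a tuple and nn.Linear exposes in_features.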

Young.J · Answer

After searching with partial keywords from the error message, I finally found a similar case. For stability, I had been using CUDA 10.2. The reference suggested upgrading the CUDA toolkit to a newer version (11.2 in my case), and that solved the problem. I had run other training processes without trouble; only this one caused the error. Since CUDA errors can occur for various reasons, changing the CUDA version is worth considering as a solution.
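Before and after changing CUDA versions, it can help to record which versions are actually in use. The following is a small sketch under the assumption that PyTorch may or may not be installed; `cuda_env_report` is an illustrative name, not part of any library. Note that torch.version.cuda reports the CUDA version the PyTorch binary was built against, which may differ from a separately installed system toolkit.

```python
import importlib.util


def cuda_env_report():
    """Collect the version facts that are relevant when cuBLAS calls fail."""
    if importlib.util.find_spec("torch") is None:
        return {"torch": None}  # PyTorch is not installed in this environment

    import torch

    report = {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
    return report


print(cuda_env_report())
```

Comparing the output before and after an upgrade shows whether the interpreter really picked up the new build.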

Tags: python, gpu, pytorch, conv-neural-network
