
could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

I installed the GPU version of TensorFlow 1.0.1 on my MacBook Pro with a GeForce GT 750M, along with CUDA 8.0.71 and cuDNN 5.1. I am running TensorFlow code that works fine with the CPU (non-GPU) version of TensorFlow, but with the GPU version I get this error (once in a while it works, too):

name: GeForce GT 750M
major: 3 minor: 0 memoryClockRate (GHz) 0.9255
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 67.48MiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0:   Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:1002] failed to allocate 67.48M (70754304 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
Training...
E tensorflow/stream_executor/cuda/cuda_dnn.cc:397] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
E tensorflow/stream_executor/cuda/cuda_dnn.cc:364] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
F tensorflow/core/kernels/conv_ops.cc:605] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Abort trap: 6

What is happening here? Is this a bug in TensorFlow? Please help.

Here is the free GPU memory reported while I run the Python code:

Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 83.477 of 2047.6 MB (i.e. 4.08%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 83.477 of 2047.6 MB (i.e. 4.08%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 83.477 of 2047.6 MB (i.e. 4.08%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 1.1016 of 2047.6 MB (i.e. 0.0538%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 1.1016 of 2047.6 MB (i.e. 0.0538%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 1.1016 of 2047.6 MB (i.e. 0.0538%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 1.1016 of 2047.6 MB (i.e. 0.0538%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 91.477 of 2047.6 MB (i.e. 4.47%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 22.852 of 2047.6 MB (i.e. 1.12%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 22.852 of 2047.6 MB (i.e. 1.12%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 36.121 of 2047.6 MB (i.e. 1.76%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 71.477 of 2047.6 MB (i.e. 3.49%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 67.477 of 2047.6 MB (i.e. 3.3%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 67.477 of 2047.6 MB (i.e. 3.3%) Free
MacBook-Pro:cuda-smi-master xxxxxx$ ./cuda-smi
Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 67.477 of 2047.6 MB (i.e. 3.3%) Free
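
To make these readings easier to interpret, here is a minimal Python sketch that pulls the free and total megabytes out of one cuda-smi line and computes how much of the card is already in use. The helper names (free_and_total_mb, used_fraction) are hypothetical, and the regular expression only assumes the line format shown above; other cuda-smi versions may print something different.

```python
import re

# Assumed line format (taken from the output above):
#   Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): 83.477 of 2047.6 MB (i.e. 4.08%) Free
_READING = re.compile(r"\):\s*([\d.]+) of ([\d.]+) MB")


def free_and_total_mb(line):
    """Return (free_mb, total_mb) parsed from one cuda-smi line, or None if it does not match."""
    match = _READING.search(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def used_fraction(line):
    """Return the fraction of GPU memory in use (0.0 to 1.0), or None if the line does not match."""
    parsed = free_and_total_mb(line)
    if parsed is None:
        return None
    free_mb, total_mb = parsed
    return 1.0 - free_mb / total_mb


sample = ("Device 0 [PCIe 0:1:0.0]: GeForce GT 750M (CC 3.0): "
          "83.477 of 2047.6 MB (i.e. 4.08%) Free")
print(free_and_total_mb(sample))
print(round(used_fraction(sample), 4))
```

Every reading above leaves well under 100 MB of the roughly 2 GB free, which matches the CUDA_ERROR_OUT_OF_MEMORY line in the log. That suggests, without proving it, that the cuDNN handle failure is a symptom of memory exhaustion and not necessarily a TensorFlow bug.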
Shimano asked Mar 31 '17 19:03



2 Answers

In TensorFlow 2.0, my issue was resolved by enabling memory growth. Since ConfigProto is deprecated in TF 2.0, I used tf.config.experimental instead. My computer specs are:

  • OS: Ubuntu 18.04
  • GPU: GeForce RTX 2070
  • NVIDIA driver: 430.26
  • TensorFlow: 2.0
  • cuDNN: 7.6.2
  • CUDA: 10.0

The code I used was:

import tensorflow as tf

# Allocate GPU memory on demand instead of reserving nearly all of it up front.
physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)
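
As a hedged extension of the snippet above, the sketch below applies the same setting to every visible GPU and adds what I believe is the TensorFlow 1.x counterpart (ConfigProto with gpu_options.allow_growth), since the question uses TF 1.0.1. The function names enable_memory_growth and make_growth_session are hypothetical, and the tensorflow module is passed in as an argument so that the import stays with the caller; this is an illustration, not the answerer's exact method.

```python
def enable_memory_growth(tf):
    """TensorFlow 2.x: request on-demand GPU memory allocation for every visible GPU.

    `tf` is the imported tensorflow module. Returns the number of GPUs configured.
    """
    gpus = tf.config.experimental.list_physical_devices('GPU')
    configured = 0
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
            configured += 1
        except RuntimeError as err:
            # Memory growth has to be set before the GPUs are initialized,
            # so call this at program start, before building any model.
            print("Could not set memory growth:", err)
    return configured


def make_growth_session(tf):
    """TensorFlow 1.x: create a Session whose GPU allocator grows on demand."""
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    return tf.Session(config=config)


# Usage (assumes TensorFlow is installed):
#   import tensorflow as tf
#   enable_memory_growth(tf)          # TF 2.x
#   sess = make_growth_session(tf)    # TF 1.x
```

Memory growth only changes how TensorFlow reserves memory; if another process already holds most of the card, freeing that memory is still necessary.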
aveevu answered Sep 17 '22 02:09



I managed to get it working by deleting the .nv folder in my home directory:

sudo rm -rf ~/.nv/ 
Félix Fu answered Sep 18 '22 02:09
