
Unpredictable CUDNN_STATUS_NOT_INITIALIZED on Windows

I am running Keras neural network training and prediction on a GTX 1070 under Windows 10. Most of the time it works, but from time to time it fails with:

E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_dnn.cc:359] could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_dnn.cc:366] error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
E c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\stream_executor\cuda\cuda_dnn.cc:326] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
F c:\tf_jenkins\home\workspace\release-win\device\gpu\os\windows\tensorflow\core\kernels\conv_ops.cc:659] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)

This cannot be explained either by the literal meaning of the error or by an out-of-memory (OOM) error.

How can I fix this?

Dims asked Jul 11 '17 16:07


2 Answers

Try limiting GPU memory usage by setting the GPU option per_process_gpu_memory_fraction.

Fiddle around with it to see what works and what doesn't.

I recommend using 0.7 as a starting baseline.
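
A minimal sketch of how that could look, assuming the TensorFlow 1.x API (tf.ConfigProto and tf.Session). The helper names validate_fraction and make_limited_session are made up for illustration and are not part of TensorFlow or Keras; the import is done lazily only so the range check can be tried without a GPU setup.

```python
def validate_fraction(fraction):
    """Return fraction as a float, or raise if it is outside (0, 1]."""
    fraction = float(fraction)
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1], got %r" % fraction)
    return fraction


def make_limited_session(fraction=0.7):
    """Hypothetical helper: build a session capped at `fraction` of GPU memory."""
    # Assumption: TensorFlow 1.x, where tf.ConfigProto and tf.Session exist
    # at the top level of the package.
    import tensorflow as tf

    config = tf.ConfigProto()
    # Cap how much of the GPU's memory this process is allowed to allocate.
    config.gpu_options.per_process_gpu_memory_fraction = validate_fraction(fraction)
    return tf.Session(config=config)


# Possible usage with Keras on the TensorFlow backend, before building a model:
#   from keras.backend.tensorflow_backend import set_session
#   set_session(make_limited_session(0.7))
```

If 0.7 still triggers the error, lowering the value step by step (0.6, 0.5, ...) is one way to find a setting that works on a given machine.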

elf answered Oct 18 '22 20:10


I sometimes ran into this problem on Windows 10 with Keras. Rebooting fixes it for a short time, but then it happens again.

Following https://github.com/fchollet/keras/issues/1538, I added this before building the model:

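import tensorflow as tf
from keras.backend.tensorflow_backend import set_session

# Let this process use at most 30% of the GPU's memory (TensorFlow 1.x API).
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
# Make Keras use a session created with this config.
set_session(tf.Session(config=config))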

With these settings the crash no longer occurred for me.

peroon answered Oct 18 '22 19:10