
Tensorflow-gpu issue (CUDA runtime error: device kernel image is invalid)

I have a python virtual environment (conda) where I’ve installed CUDA toolkit 10.1.243 and tensorflow-gpu 2.3.0rc0. My CUDA driver is 11.0.

To test whether TensorFlow was set up to use the GPU correctly, I ran a series of commands from within the environment:

tf.test.is_built_with_cuda()

True

tf.config.list_physical_devices('GPU')

Found device 0 with properties: pciBusID: 0000:01:00.0 name: Quadro M2000M computeCapability: 5.0
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000,1000])))"

tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: device kernel image is invalid

I am not sure how to troubleshoot this. I suspect TensorFlow needs to be compiled so that it supports the compute capability of my device (5.0), but I am not sure how to proceed. Thank you!!
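To make the compute-capability suspicion concrete, here is a minimal sketch of the general CUDA compatibility rule: a device can run precompiled (SASS) kernels built for the same major version with an equal or lower minor version, or it can JIT-compile PTX built for an equal or lower capability. The function names are hypothetical, and the target lists are placeholders chosen for illustration, not the actual build targets of any TensorFlow release.

```python
import re


def parse_compute_capability(log_line):
    """Extract (major, minor) from a TensorFlow 'Found device' log line."""
    match = re.search(r"computeCapability:\s*(\d+)\.(\d+)", log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_usable_kernel_image(device_cc, sass_targets, ptx_targets):
    """Return True if a binary with these targets can run on the device.

    SASS (precompiled) code runs only on the same major version with a
    device minor version >= the target's minor version.
    PTX can be JIT-compiled for any device with capability >= the target.
    """
    major, minor = device_cc
    sass_ok = any(
        t_major == major and t_minor <= minor
        for t_major, t_minor in sass_targets
    )
    ptx_ok = any(tuple(target) <= (major, minor) for target in ptx_targets)
    return sass_ok or ptx_ok


log_line = (
    "Found device 0 with properties: pciBusID: 0000:01:00.0 "
    "name: Quadro M2000M computeCapability: 5.0"
)
device_cc = parse_compute_capability(log_line)

# Placeholder targets for illustration only (an assumption, not real build data).
example_sass = [(5, 2), (6, 0)]
example_ptx = [(7, 0)]

print(device_cc, has_usable_kernel_image(device_cc, example_sass, example_ptx))
```

If a binary ships no matching SASS and no sufficiently old PTX for the device, the runtime has nothing it can load, which would be consistent with a "device kernel image is invalid" error.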

cuda newb asked Aug 03 '20 14:08


2 Answers

I just had the same problem. I downgraded TensorFlow from 2.3 to 2.2 with the following command.

pip install --upgrade tensorflow==2.2

It works now, but it is very slow.

Ugurcan answered Sep 30 '22 01:09


According to the explanation in this GitHub issue, the TensorFlow engineering team at Google has already dropped support for some older GPUs: https://github.com/tensorflow/tensorflow/issues/41990

I believe your GPU belongs to that older GPU family, so downgrading TF from 2.3 to 2.2 is one solution. The TF engineers also suggest building TF 2.3 ourselves and changing its build configuration scripts to enable support for older GPUs, but the Google TF team does not confirm that this will work and takes no responsibility for fixing any problems we encounter.
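For the build-it-yourself route, a rough sketch of preparing the environment for TensorFlow's configure script might look like the following. It assumes the TF_NEED_CUDA and TF_CUDA_COMPUTE_CAPABILITIES environment variables that the configure script reads; the helper name source_build_env is hypothetical, and nothing here guarantees that the resulting build works on a given GPU.

```python
import os


def source_build_env(capabilities, base_env=None):
    """Return an environment dict for a TensorFlow source build.

    `capabilities` is a list of compute capability strings such as ["5.0"].
    `base_env` defaults to a copy of the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TF_NEED_CUDA"] = "1"
    # Comma-separated list of the capabilities to compile kernels for.
    env["TF_CUDA_COMPUTE_CAPABILITIES"] = ",".join(capabilities)
    return env


env = source_build_env(["5.0"])
print(env["TF_CUDA_COMPUTE_CAPABILITIES"])
```

The returned dict could then be passed as the env argument of subprocess.run when invoking ./configure from a TensorFlow source checkout, so that the compute capability of the Quadro M2000M (5.0) is among the build targets.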

Clock ZHONG answered Sep 30 '22 02:09