I trained a model using CUDNNLSTM in TensorFlow on a GPU. When I try to run the model on a CPU for inference, I get this error:
Invalid argument: No OpKernel was registered to support Op 'CudnnRNN' with these attrs. Registered devices: [CPU], Registered kernels:
<no registered kernels>
[[Node: cudnn_lstm/CudnnRNN = CudnnRNN[T=DT_FLOAT, direction="bidirectional", dropout=0, input_mode="linear_input", is_training=false, rnn_mode="lstm", seed=87654321, seed2=4567](Reshape_1, cudnn_lstm/zeros, cudnn_lstm/zeros_1, cudnn_lstm/opaque_kernel/read)]]
How can I use this model on a CPU?
Please have a look at the comments in the TensorFlow source code of the CuDNN LSTM layer: https://github.com/tensorflow/tensorflow/blob/r1.6/tensorflow/contrib/cudnn_rnn/python/layers/cudnn_rnn.py
They describe how to do what you want, from line 83 onwards. Basically, after training with CuDNN layers, you need to transfer the weights to a model built with CuDNN-compatible LSTM cells. Such a model runs on both CPU and GPU. Also, as far as I know, CuDNN LSTM layers in TensorFlow are time-major, so don't forget to transpose your inputs (I'm not sure whether this still holds in the latest TensorFlow version, so please confirm it).
For a short, complete example based on the above, check out melgor's gist:
https://gist.github.com/melgor/41e7d9367410b71dfddc33db34cba85f?short_path=29ebfc6
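To make the time-major remark concrete: a batch-major input has shape [batch, time, features], while a time-major one has shape [time, batch, features]. The sketch below performs that axis swap on plain nested Python lists so it runs without TensorFlow. `to_time_major` is a hypothetical helper name, not a library function. In TensorFlow the same swap for a rank-3 tensor is `tf.transpose(x, perm=[1, 0, 2])`.

```python
def to_time_major(batch_major):
    """Swap the first two axes: [batch, time, features] -> [time, batch, features].

    Assumes every sequence in the batch has the same number of time steps.
    """
    # zip(*...) regroups the sequences step by step: the i-th tuple holds
    # time step i of every sequence in the batch.
    return [list(step) for step in zip(*batch_major)]


# Two sequences (batch=2), three time steps each, two features per step.
batch_major = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
]
time_major = to_time_major(batch_major)
print(len(time_major), len(time_major[0]))  # prints: 3 2
```

Applying the same swap twice returns the original layout, which is a quick sanity check when you are unsure which layout a tensor currently has.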
Reason: TensorFlow doesn't see your GPU
Summary:
1. check if tensorflow sees your GPU (optional)
2. check if your video card can work with tensorflow (optional)
3. find the versions of CUDA Toolkit and cuDNN SDK that are compatible with your tf version
(https://www.tensorflow.org/install/source#linux)
4. install CUDA Toolkit
(https://developer.nvidia.com/cuda-toolkit-archive)
5. install cuDNN SDK
(https://developer.nvidia.com/rdp/cudnn-archive)
6. pip uninstall tensorflow; pip install tensorflow-gpu
7. check if tensorflow sees your GPU
* source - https://www.tensorflow.org/install/gpu
Detailed instruction:
1. check if tensorflow sees your GPU (optional)
from tensorflow.python.client import device_lib

def get_available_devices():
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos]

print(get_available_devices())
# my output was => ['/device:CPU:0']
# good output must be => ['/device:CPU:0', '/device:GPU:0']
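If you want to act on that list in a script rather than read it by eye, a minimal sketch could look like this. It only inspects device-name strings in the format shown above, so it runs without TensorFlow. `gpu_devices` and `describe_devices` are hypothetical helper names.

```python
def gpu_devices(device_names):
    """Return the entries that name a GPU, e.g. '/device:GPU:0'."""
    return [name for name in device_names if "/device:GPU:" in name]


def describe_devices(device_names):
    """Summarize whether any GPU appears in the device list."""
    gpus = gpu_devices(device_names)
    if gpus:
        return "GPU visible: " + ", ".join(gpus)
    return "CPU only: TensorFlow does not see a GPU"


print(describe_devices(['/device:CPU:0']))
print(describe_devices(['/device:CPU:0', '/device:GPU:0']))
```

In real use you would pass the result of get_available_devices() instead of the literal lists.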
2. check if your card can work with tensorflow (optional)
* my PC: GeForce GTX 1060 notebook (driver version - 419.35), windows 10, jupyter notebook
* tensorflow needs Compute Capability 3.5 or higher. (https://www.tensorflow.org/install/gpu#hardware_requirements)
- https://developer.nvidia.com/cuda-gpus
- select "CUDA-Enabled GeForce Products"
- result - "GeForce GTX 1060 Compute Capability = 6.1"
- my card can work with tf!
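The comparison in step 2 can be sketched in a few lines. The point of the example is that Compute Capability values should be compared as (major, minor) number pairs, not as strings, because as strings "10.0" would sort below "3.5". The helper names are hypothetical, and the default minimum of 3.5 is the value quoted above.

```python
def parse_capability(text):
    """Turn a string like '6.1' into a comparable tuple (6, 1)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def is_supported(card_capability, minimum="3.5"):
    """True if the card's Compute Capability is at least the minimum."""
    return parse_capability(card_capability) >= parse_capability(minimum)


print(is_supported("6.1"))  # GeForce GTX 1060 from the example above; prints True
```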
3. find versions of CUDA Toolkit and cuDNN SDK, that you need
a) find your tf version
import tensorflow as tf
print(tf.__version__)
# my output was => 1.13.1
b) find right versions of CUDA Toolkit and cuDNN SDK for your tf version
https://www.tensorflow.org/install/source#linux
* the table is written for Linux, but the same versions worked in my case on Windows
the table shows that tensorflow_gpu-1.13.1 needs: CUDA Toolkit v10.0, cuDNN SDK v7.4
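If you set up several machines, it may help to record the lookup from step 3 in code. The sketch below is only a stub: it contains the single row quoted above, and any further rows would have to be copied from the table on the TensorFlow install page. `TESTED_CONFIGS` and `required_versions` are hypothetical names.

```python
# Only the row quoted above is filled in; copy further rows from the
# table at https://www.tensorflow.org/install/source#linux as needed.
TESTED_CONFIGS = {
    "1.13.1": {"cuda": "10.0", "cudnn": "7.4"},
}


def required_versions(tf_version):
    """Return the CUDA/cuDNN pair recorded for a TensorFlow version, or None."""
    return TESTED_CONFIGS.get(tf_version.strip())


print(required_versions("1.13.1"))
```

A `None` result means the version is not in your copy of the table yet, not that the version is unsupported.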
4. install CUDA Toolkit
a) install CUDA Toolkit 10.0
https://developer.nvidia.com/cuda-toolkit-archive
select: CUDA Toolkit 10.0 and download base installer (2 GB)
installation settings: select only CUDA
(my installation path was: D:\Programs\x64\Nvidia\Cuda_v_10_0\Development)
b) add environment variables:
system variables / path must have:
D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\bin
D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\libnvvp
D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\extras\CUPTI\libx64
D:\Programs\x64\Nvidia\Cuda_v_10_0\Development\include
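A typo in one of these entries is easy to miss, so here is a small sketch that reports which of the required directories are absent from PATH. It assumes a Windows-style, semicolon-separated PATH and compares entries case-insensitively, ignoring trailing backslashes. The helper names are hypothetical, and the directories are the ones from my installation, so adjust `cuda_root` to yours.

```python
import ntpath
import os


def normalize(path):
    """Normalize a Windows path so comparisons ignore case and trailing slashes."""
    return ntpath.normcase(ntpath.normpath(path.strip()))


def missing_path_entries(required, path_value=None):
    """Return the required directories that are absent from a ';'-separated PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    present = {normalize(p) for p in path_value.split(";") if p.strip()}
    return [d for d in required if normalize(d) not in present]


cuda_root = r"D:\Programs\x64\Nvidia\Cuda_v_10_0\Development"
required = [
    cuda_root + r"\bin",
    cuda_root + r"\libnvvp",
    cuda_root + r"\extras\CUPTI\libx64",
    cuda_root + r"\include",
]
# An empty list means every required directory is on PATH.
print(missing_path_entries(required))
```

Open a new terminal (or restart Jupyter) before running it, since already-running processes keep the old PATH.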
5. install cuDNN SDK
a) download cuDNN SDK v7.4
https://developer.nvidia.com/rdp/cudnn-archive (needs registration, but it is simple)
select "Download cuDNN v7.4.2 (Dec 14, 2018), for CUDA 10.0"
b) add path to 'bin' folder into "environment variables / system variables / path":
D:\Programs\x64\Nvidia\cudnn_for_cuda_10_0\bin
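To confirm that the cuDNN library is actually reachable through PATH, you can search the PATH directories for the DLL. This is a minimal sketch with a hypothetical helper name. The file name `cudnn64_7.dll` is my assumption for the cuDNN 7.x Windows package, so check the 'bin' folder you unpacked if your version differs.

```python
import os


def find_on_path(filename, path_value=None, sep=os.pathsep):
    """Return every directory listed in PATH that contains `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(sep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


# Assumed DLL name for cuDNN 7.x on Windows; an empty list means it was not found.
print(find_on_path("cudnn64_7.dll"))
```

More than one hit can also be a problem, because an older cuDNN earlier in PATH would be loaded first.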
6. replace tensorflow with tensorflow-gpu
pip uninstall tensorflow
pip install tensorflow-gpu
7. check if tensorflow sees your GPU
restart your PC, then define get_available_devices() as in step 1 again and run:
print(get_available_devices())
# now this code should return => ['/device:CPU:0', '/device:GPU:0']