using cuDNN kernel for LSTM

I want to train my RNN model using the cuDNN kernel:

max_length <- 140 
embedding_dim <- 128

model <- keras_model_sequential()

# define model
model %>% 
  # layer input
  layer_embedding(
    name = "input",
    input_dim = num_words,
    input_length = max_length,
    output_dim = embedding_dim, 
    embeddings_initializer = initializer_random_uniform(minval = -0.05, maxval = 0.05, seed = 2)
  ) %>%
  # layer dropout
  layer_spatial_dropout_1d(
    name = "embedding_dropout",
    rate = 0.2
  ) %>%
  # layer lstm 1
  bidirectional(layer_lstm(
    name = "lstm",
    units = 64,
    unroll = FALSE,
    dropout = 0.2,
    use_bias = TRUE,
    recurrent_dropout = 0,
    return_sequences = TRUE
  )) %>% 
  layer_batch_normalization() %>%
  # layer output
  layer_dense(
    name = "output",
    units = 3,
    activation = "softmax"
  )

When I run this I get the following warning:

WARNING:tensorflow:Layer lstm will not use cuDNN kernel since it doesn't meet the cuDNN kernel criteria. It will use generic GPU kernel as fallback when running on GPU

I think I have followed all the requirements, but I'm not sure what I'm missing.

sessionInfo():

R version 4.0.0 (2020-04-24)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] keras_2.3.0.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6     lattice_0.20-41  zeallot_0.1.0    rappdirs_0.3.1  
 [5] grid_4.0.0       R6_2.4.1         jsonlite_1.6.1   magrittr_1.5    
 [9] tfruns_1.4       whisker_0.4      Matrix_1.2-18    reticulate_1.15 
[13] generics_0.0.2   tools_4.0.0      xfun_0.14        compiler_4.0.0  
[17] base64enc_0.1-3  tensorflow_2.2.0 knitr_1.28   
asked May 27 '20 by capiono

People also ask

What is CuDNN LSTM?

In Keras, the high-level deep learning library, there are multiple types of recurrent layers; these include LSTM (Long Short-Term Memory) and CuDNNLSTM. According to the Keras documentation, a CuDNNLSTM is a "Fast LSTM implementation backed by cuDNN. Can only be run on GPU, with the TensorFlow backend."
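With TensorFlow 2.x you normally don't need the explicit CuDNNLSTM layer at all. As a rough sketch in the R keras interface (illustrative only; exact behaviour depends on your versions):

# With TF 2.x, the standard layer switches to the cuDNN kernel on GPU
# whenever its arguments allow it (tanh/sigmoid activations, no recurrent dropout, ...).
library(keras)
fast_lstm <- layer_lstm(units = 64)

# Older releases of the keras R package (TF 1.x backend) instead exposed the
# cuDNN-only layer explicitly, e.g. layer_cudnn_lstm(); that layer is
# deprecated under TensorFlow 2.x.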

Does TensorFlow support LSTM?

Yes, the Keras API in TensorFlow provides an LSTM layer, and TensorFlow Lite also provides a way to convert user-defined LSTM implementations.

What is the default activation function in LSTM?

Activation function to use. Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
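As a small illustration in R (argument names as in the keras package; the defaults shown are the TF 2.x ones):

# TF 2.x defaults: activation = "tanh", recurrent_activation = "sigmoid"
lstm_default <- layer_lstm(units = 32)

# Passing NULL from R maps to Python's None, i.e. linear activation a(x) = x
lstm_linear <- layer_lstm(units = 32, activation = NULL)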

What's the output shape of a bidirectional LSTM layer with 64 units?

This is because you are using a Bidirectional layer: the forward and backward passes are concatenated, so the output shape will be (None, None, 64 + 64 = 128).
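A toy sketch (not the model from the question) that shows the concatenated width in the layer summary:

# A 64-unit bidirectional LSTM with return_sequences = TRUE reports a last
# dimension of 128: 64 forward units concatenated with 64 backward units.
library(keras)

demo <- keras_model_sequential() %>%
  layer_embedding(input_dim = 1000, output_dim = 32) %>%
  bidirectional(layer_lstm(units = 64, return_sequences = TRUE))

summary(demo)  # the bidirectional layer's output shape is (None, None, 128)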


1 Answer

I ran into the same problem and fixed it by manually setting the layer options to use the cuDNN-compatible implementation, as specified in the LSTM layer documentation:

"Based on available runtime hardware and constraints, this layer will choose different implementations (cuDNN-based or pure-TensorFlow) to maximize the performance. If a GPU is available and all the arguments to the layer meet the requirement of the CuDNN kernel (see below for details), the layer will use a fast cuDNN implementation."

The requirements to use the cuDNN implementation are:

  1. activation == tanh
  2. recurrent_activation == sigmoid
  3. recurrent_dropout == 0
  4. unroll is False
  5. use_bias is True
  6. Inputs, if masking is used, are strictly right-padded.
  7. Eager execution is enabled in the outermost context.

In particular, I had to specify recurrent_activation = "sigmoid": the version of Keras/TF I had installed defaulted to recurrent_activation = "hard_sigmoid", which violates requirement 2 above.
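For reference, here is a sketch of how the lstm block from the question could be written so that every cuDNN requirement is stated explicitly (a drop-in replacement for the bidirectional block in the model definition above; the rest of the pipeline stays the same):

  # layer lstm 1 (cuDNN-eligible)
  bidirectional(layer_lstm(
    name = "lstm",
    units = 64,
    activation = "tanh",               # requirement 1
    recurrent_activation = "sigmoid",  # requirement 2
    recurrent_dropout = 0,             # requirement 3
    unroll = FALSE,                    # requirement 4
    use_bias = TRUE,                   # requirement 5
    dropout = 0.2,                     # plain input dropout does not disable cuDNN
    return_sequences = TRUE
  )) %>%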

answered Oct 24 '22 by norinhara