Why is my GPU slower than CPU when training LSTM/RNN models?

My machine has the following spec:

CPU: Xeon E5-1620 v4

GPU: Titan X (Pascal)

Ubuntu 16.04

Nvidia driver 375.26

CUDA toolkit 8.0

cuDNN 5.1

I've benchmarked the following Keras examples with TensorFlow as the backend:

SCRIPT NAME                  GPU       CPU
stateful_lstm.py             5sec      5sec
babi_rnn.py                  10sec     12sec
imdb_bidirectional_lstm.py   240sec    116sec
imdb_lstm.py                 113sec    106sec

My GPU is clearly outperforming my CPU in non-LSTM models:

SCRIPT NAME                  GPU       CPU
cifar10_cnn.py               12sec     123sec
imdb_cnn.py                  5sec      119sec
mnist_cnn.py                 3sec      47sec 

Has anyone else experienced this?

asked Jan 31 '17 by agsolid


3 Answers

If you use Keras, use CuDNNLSTM in place of LSTM, or CuDNNGRU in place of GRU. In my case (2 x Tesla M60), I am seeing a 10x performance boost. By the way, I am using a batch size of 128, as suggested by @Alexey Golyshev.
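
A minimal sketch of the swap, assuming Keras 2.0.9+ on the TensorFlow backend with a CUDA-capable GPU; the vocabulary size, layer widths and IMDB-style setup are placeholders, not taken from the benchmark scripts:

from keras.models import Sequential
from keras.layers import Embedding, Dense, CuDNNLSTM  # CuDNNGRU is the GRU counterpart

model = Sequential()
model.add(Embedding(20000, 128))           # placeholder vocabulary size and embedding dim
model.add(CuDNNLSTM(128))                  # drop-in for LSTM(128), backed by the cuDNN kernel
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# model.fit(x_train, y_train, batch_size=128, epochs=2)  # batch size 128 per the suggestion above

Note that CuDNNLSTM only supports the default activations and has no recurrent dropout, so it is not a drop-in replacement for every LSTM configuration. In more recent TensorFlow/Keras 2.x releases the standard LSTM layer selects the cuDNN kernel automatically when its arguments allow it.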

answered by neurite


Your batch size is too small. Try increasing it (a sketch of where batch_size is set follows the tables below).

Results for my GTX1050Ti:

imdb_bidirectional_lstm.py
batch_size      time (sec)
32 (default)    252
64              131
96              87
128             66

imdb_lstm.py
batch_size      time (sec)
32 (default)    108
64              50
96              34
128             25
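
For completeness, a minimal sketch (hypothetical data and layer sizes, roughly mirroring the imdb_lstm.py setup) showing where batch_size is passed in a Keras training run:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

# fake IMDB-like data: 1000 sequences of 80 word indices, binary labels
x_train = np.random.randint(0, 20000, size=(1000, 80))
y_train = np.random.randint(0, 2, size=(1000,))

model = Sequential()
model.add(Embedding(20000, 128))
model.add(LSTM(128))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=128, epochs=1)  # a larger batch_size keeps the GPU busier
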
answered by Alexey Golyshev


Just a tip.

Using a GPU pays off when:

1. your neural network model is big.
2. your batch size is big.

This is what I found from googling; a rough way to check it on your own machine is sketched below.
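
A rough timing sketch along those lines (layer sizes, data shapes and batch size are made up, not taken from any of the scripts above): run it once on the GPU and once with CUDA_VISIBLE_DEVICES="" and watch how the gap widens as the recurrent layer grows.

import time
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# placeholder data: 2000 samples, 50 timesteps, 64 features
x = np.random.rand(2000, 50, 64).astype('float32')
y = np.random.randint(0, 2, size=(2000,))

for units in (32, 512):  # small vs. large recurrent layer
    model = Sequential([LSTM(units, input_shape=(50, 64)),
                        Dense(1, activation='sigmoid')])
    model.compile(loss='binary_crossentropy', optimizer='adam')
    start = time.time()
    model.fit(x, y, batch_size=128, epochs=1, verbose=0)
    print('%d units: %.1f sec' % (units, time.time() - start))
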

answered by Dane Lee