I use this notebook from Kaggle to run LSTM neural network. I had started training of neural network and I saw that it is too slow. It is almost three times slower than CPU training. <ul> <li> <code>CPU perfomance:</code> 8 min per epoch;</li> <li> <code>GPU perfomance:</code> 26 min per epoch.</li> </ul> After this I decided to find answer in this question on Stackoverflow and I applied a <code>CuDNNLSTM</code> (which runs only on GPU) instead of <code>LSTM</code>. Hence, GPU perfomance became only 1 min per epoch and accuracy of model decreased on 3%. Questions: 1) Does somebody know why GPU works slower than CPU in the classic <code>LSTM</code> layer? I do not understand why this happens. 2) Why when I use <code>CuDNNLSTM</code> instead of <code>LSTM</code>, training become much more faster and the accuracy of the model decrease? P.S.: <code>My CPU:</code> Intel Core i7-7700 Processor (8M Cache, up to 4.20 GHz) <code>My GPU:</code> nVidia GeForce GTX 1050 Ti (4 GB)

I had a similar problem today and found two things that may be helpful to others (this is a regression problem on a data set with ~2.1MM rows, running on a machine with 4 P100 GPUs): <ol> <li>Using the CuDNNLSTM layer instead of the LSTM layer on a GPU machine reduced the fit time from ~13500 seconds to ~400 seconds per epoch.</li> <li>Increasing the batch size (~500 to ~4700) reduced it to ~130 seconds per epoch.</li> </ol> Reducing the batch size has increase loss and val loss, so you'll need to make a decision about the trade offs you want to make.

Why is Keras LSTM on CPU three times faster than GPU?

1 Answers

I had a similar problem today and found two things that may be helpful to others (this is a regression problem on a data set with ~2.1MM rows, running on a machine with 4 P100 GPUs):

Using the CuDNNLSTM layer instead of the LSTM layer on a GPU machine reduced the fit time from ~13500 seconds to ~400 seconds per epoch.
Increasing the batch size (~500 to ~4700) reduced it to ~130 seconds per epoch.

Reducing the batch size has increase loss and val loss, so you'll need to make a decision about the trade offs you want to make.

answered Sep 23 '22 23:09

ericbdevil

Related questions
                            
                                Round to nearest 1000 in pandas
                            
                                Pandas, how to combine multiple columns into an array column
                            
                                Django '/' only homepage url error
                            
                                Making numpy arrays JSON serializable
                            
                                opposite of df.diff() in pandas
                            
                                What does x in range(...) == y mean in Python 3? [duplicate]
                            
                                Django's template tag inside javascript
                            
                                Unit test pyspark code using python
                            
                                Python: Normalize image exposure
                            
                                Keep a column with a categorical variable in Pandas with groupby and mean()
                            
                                Flask ImportError: cannot import name app
                            
                                How to apply a condition to pandas iloc
                            
                                Why Doc2vec gives 2 different vectors for the same texts
                            
                                Start CloudSQL Proxy on Python Dataflow / Apache Beam
                            
                                Custom weight initialization in PyTorch
                            
                                Individual axes limits for pairplot in python
                            
                                Prune unnecessary leaves in sklearn DecisionTreeClassifier
                            
                                Using numpy.vstack in numba
                            
                                K-means using only specific dataframe columns with scikit-learn
                            
                                How to combine multiple rows into a single row with python pandas based on the values of multiple columns?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Why is Keras LSTM on CPU three times faster than GPU?

Tags:

python

machine-learning

tensorflow

gpu

keras

lemon

People also ask

1 Answers

ericbdevil

Recent Activity

Donate For Us