
Keras LSTM on CPU faster than GPU?

I am testing LSTM networks in Keras and I am getting much faster training on the CPU (5 seconds/epoch on an i2600k with 16 GB RAM) than on the GPU (35 seconds/epoch on an Nvidia 1060 6GB). GPU utilisation runs at around 15%, and I never see it over 30% when trying other LSTM networks, including the Keras examples. When I run other types of networks (MLP and CNN) the GPU is much faster. I am using the latest Theano (0.9.0dev4) and Keras 1.2.0.

The sequence has 50,000 timesteps with 3 inputs (integers) per step.

If the inputs are descending (e.g. 3, 2, 1) the output is 0; if they are ascending the output is 1, except when the previous two steps were also ascending, in which case the output is 0 instead of 1.

After 250 epochs I get 99.97% accuracy, but why is the GPU so much slower? Am I doing something wrong in the model? I tried various batch settings and still had the same issue.

import random
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense, Activation
from keras.optimizers import RMSprop

def generate_data():
    # Build one long sequence of 50,000 timesteps.
    # Each timestep is a triple that either ascends or descends by 1.
    X = []
    Y = []
    for i in range(50000):
        start = random.randint(1, 100)
        d = random.randrange(-1, 2, 2)  # -1 or 1
        param = [start, start + d, start + d + d]
        X.append(np.array(param))
        if d < 0:
            # Descending triple -> class 0
            Y.append([1, 0])
        elif len(Y) > 2 and d > 0 and Y[-1][1] == 1 and Y[-2][1] == 1:
            # Ascending, but the previous two steps were also ascending -> class 0
            Y.append([1, 0])
        elif d > 0:
            # Ascending -> class 1
            Y.append([0, 1])
    return np.array(X), np.array(Y)

X, Y = generate_data()
X = np.asarray(X, 'float32')
Y = np.asarray(Y, 'float32')
# Reshape to a single sample: (1, 50000, 3) inputs and (1, 50000, 2) targets
X = np.reshape(X, (1, len(X), 3))
Y = np.reshape(Y, (1, len(Y), 2))

model = Sequential()
model.add(LSTM(20, input_shape=(50000, 3), return_sequences=True))
model.add(Dense(2))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer=RMSprop(), metrics=['accuracy'])
history = model.fit(X, Y, batch_size=100, nb_epoch=250, verbose=2)
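
For reference, a minimal way to confirm which device Theano is actually configured to use (this is a sketch for the Theano backend above; the device can also be forced via the THEANO_FLAGS environment variable, e.g. THEANO_FLAGS=device=gpu,floatX=float32):

import theano

# 'cpu' means the GPU is not being used at all; 'gpu' / 'cuda*' means it is.
print(theano.config.device)
# The old GPU backend only runs float32 on the GPU, so float32 is what we want here.
print(theano.config.floatX)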

Any thoughts? Thank you!

asked Feb 01 '17 by George Battaglia


People also ask

Is LSTM faster on GPU?

Accelerating Long Short-Term Memory using GPUs: GPUs are the de facto standard for LSTM usage and deliver a 6x speedup during training and 140x higher throughput during inference when compared to CPU implementations.

Does keras run seamlessly on GPU and CPU?

Via TensorFlow (or Theano, or CNTK), Keras is able to run seamlessly on both CPUs and GPUs. When running on a CPU, TensorFlow itself wraps a low-level library for tensor operations called Eigen.
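
To confirm that the backend actually sees a GPU, a quick check (a minimal sketch assuming the TensorFlow backend; with Theano, inspecting theano.config.device serves the same purpose):

from tensorflow.python.client import device_lib

# Lists the devices TensorFlow can place operations on;
# a working GPU shows up with device_type='GPU'.
print([d.name for d in device_lib.list_local_devices()])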

Is GPU always faster than CPU?

Due to its parallel processing capability, a GPU can be much faster than a CPU. For hardware from the same production year, GPU peak performance can be ten-fold that of a CPU, with significantly higher memory system bandwidth. In short, GPUs provide superior processing power and memory bandwidth for highly parallel workloads.


1 Answer

Use Keras' CuDNNLSTM cells for accelerated compute on Nvidia GPUs: https://keras.io/layers/recurrent/#cudnnlstm

It's simply changing the LSTM line to:

model.add(CuDNNLSTM(20, input_shape=(50000,3), return_sequences=True))
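
A slightly fuller sketch of the same model with that drop-in replacement (this assumes Keras 2.x on the TensorFlow backend with a CUDA/cuDNN-capable GPU; CuDNNLSTM does not exist in the Theano/Keras 1.2.0 setup from the question):

from keras.models import Sequential
from keras.layers import CuDNNLSTM, Dense, Activation
from keras.optimizers import RMSprop

# Same architecture as in the question, but with the cuDNN-backed LSTM cell.
# CuDNNLSTM is hard-wired to the standard tanh/sigmoid activations, which
# matches the default LSTM configuration used above.
model = Sequential()
model.add(CuDNNLSTM(20, input_shape=(50000, 3), return_sequences=True))
model.add(Dense(2))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer=RMSprop(), metrics=['accuracy'])

# X, Y as prepared in the question: shapes (1, 50000, 3) and (1, 50000, 2).
# Note that Keras 2.x fit() takes epochs= instead of the nb_epoch= used in the question.
history = model.fit(X, Y, batch_size=100, epochs=250, verbose=2)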
answered Oct 21 '22 by Roy Shilkrot