Whenever I try out LSTM models in Keras, training seems impractically slow. For instance, a model like this takes about 80 seconds per step to train:
def create_model(self):
    inputs = {}
    inputs['input'] = []
    lstm = []
    for tf, v in self.env.timeframes.items():
        inputs[tf] = Input(shape = v['shape'], name = tf)
        lstm.append(LSTM(8)(inputs[tf]))
        inputs['input'].append(inputs[tf])
    account = Input(shape = (3,), name = 'account')
    account_ = Dense(8, activation = 'relu')(account)
    dt = Input(shape = (7,), name = 'dt')
    dt_ = Dense(16, activation = 'relu')(dt)
    inputs['input'].extend([account, dt])
    data = Concatenate(axis = 1)(lstm)
    data = Dense(128, activation = 'relu')(data)
    y = Concatenate(axis = 1)([data, account, dt])
    y = Dense(256, activation = 'relu')(y)
    y = Dense(64, activation = 'relu')(y)
    y = Dense(16, activation = 'relu')(y)
    output = Dense(3, activation = 'linear')(y)
    model = Model(inputs = inputs['input'], outputs = output)
    model.compile(loss = 'mse', optimizer = 'adam', metrics = ['mae'])
    return model
Whereas the same model with the LSTM layers substituted by Flatten + Dense, like this:
def create_model(self):
    inputs = {}
    inputs['input'] = []
    lstm = []
    placeholder = {}
    for tf, v in self.env.timeframes.items():
        inputs[tf] = Input(shape = v['shape'], name = tf)
        #lstm.append(LSTM(8)(inputs[tf]))
        placeholder[tf] = Flatten()(inputs[tf])
        lstm.append(Dense(32, activation = 'relu')(placeholder[tf]))
        inputs['input'].append(inputs[tf])
    account = Input(shape = (3,), name = 'account')
    account_ = Dense(8, activation = 'relu')(account)
    dt = Input(shape = (7,), name = 'dt')
    dt_ = Dense(16, activation = 'relu')(dt)
    inputs['input'].extend([account, dt])
    data = Concatenate(axis = 1)(lstm)
    data = Dense(128, activation = 'relu')(data)
    y = Concatenate(axis = 1)([data, account, dt])
    y = Dense(256, activation = 'relu')(y)
    y = Dense(64, activation = 'relu')(y)
    y = Dense(16, activation = 'relu')(y)
    output = Dense(3, activation = 'linear')(y)
    model = Model(inputs = inputs['input'], outputs = output)
    model.compile(loss = 'mse', optimizer = 'adam', metrics = ['mae'])
    return model
takes only 45-50 ms per step to train.
Is there something wrong with the model that is causing this, or is this as fast as an LSTM model will run?
-- self.env.timeframes is a dictionary with 9 items:
timeframes = {
    's1':  {'lookback': 86400, 'word': '1 s',    'unit': 1,     'offset': 12},
    's5':  {'lookback': 200,   'word': '5 s',    'unit': 5,     'offset': 2},
    'm1':  {'lookback': 100,   'word': '1 min',  'unit': 60,    'offset': 0},
    'm5':  {'lookback': 100,   'word': '5 min',  'unit': 300,   'offset': 0},
    'm30': {'lookback': 100,   'word': '30 min', 'unit': 1800,  'offset': 0},
    'h1':  {'lookback': 200,   'word': '1 h',    'unit': 3600,  'offset': 0},
    'h4':  {'lookback': 200,   'word': '4 h',    'unit': 14400, 'offset': 0},
    'h12': {'lookback': 100,   'word': '12 h',   'unit': 43200, 'offset': 0},
    'd1':  {'lookback': 200,   'word': '1 d',    'unit': 86400, 'offset': 0}
}
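Assuming each input's shape corresponds to (lookback, features), a quick sum of the lookbacks above shows how many timesteps the recurrent layers must unroll per sample, and that the 's1' input alone accounts for nearly all of them:

```python
# Sketch: total recurrent timesteps per training sample, assuming each
# input's shape is (lookback, features) for the timeframes listed above.
lookbacks = {'s1': 86400, 's5': 200, 'm1': 100, 'm5': 100, 'm30': 100,
             'h1': 200, 'h4': 200, 'h12': 100, 'd1': 200}

total = sum(lookbacks.values())
print(total)                     # 87600 timesteps in all
print(lookbacks['s1'] / total)   # ~0.986 -- 's1' dominates the unroll cost
```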
GPU info from the console:
2018-06-30 07:35:16.204320: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2018-06-30 07:35:16.495832: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1356] Found device 0 with properties:
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.86
pciBusID: 0000:01:00.0
totalMemory: 8.00GiB freeMemory: 6.59GiB
2018-06-30 07:35:16.495981: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1435] Adding visible gpu devices: 0
2018-06-30 07:35:16.956743: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-06-30 07:35:16.956827: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:929] 0
2018-06-30 07:35:16.957540: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:942] 0: N
2018-06-30 07:35:16.957865: I T:\src\github\tensorflow\tensorflow\core\common_runtime\gpu\gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6370 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
Why is LSTM slow? LSTMs are notoriously slow to train because the recurrence creates a strict sequential dependency: the hidden state at each timestep depends on the state at the previous one, so the computation cannot be parallelised across time and the per-sample cost grows linearly with the sequence length. On top of that, RNN/LSTM training is memory-bandwidth-bound rather than compute-bound, which is the worst case for modern hardware and ultimately limits how fast these models can run.
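The sequential dependency can be illustrated with a toy recurrence (a stand-in for the LSTM cell, not Keras code): each step needs the result of the step before it, so a sequence with 86,400 timesteps forces 86,400 strictly serial updates per sample, while a Dense layer processes all its inputs in one parallel matrix multiply.

```python
# Minimal sketch of why a recurrent layer cannot parallelise across time.
def rnn_steps(seq, h=0.0):
    states = []
    for x in seq:          # sequential: step t needs the result of step t-1
        h = 0.5 * h + x    # toy recurrence standing in for the LSTM cell
        states.append(h)
    return states

print(rnn_steps([1.0, 0.0, 0.0]))   # [1.0, 0.5, 0.25]
```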
If you are training on a GPU, replace all LSTM layers with CuDNNLSTM layers, which you can import from keras.layers:
from keras.layers import CuDNNLSTM
def create_model(self):
    inputs = {}
    inputs['input'] = []
    lstm = []
    for tf, v in self.env.timeframes.items():
        inputs[tf] = Input(shape = v['shape'], name = tf)
        lstm.append(CuDNNLSTM(8)(inputs[tf]))
        inputs['input'].append(inputs[tf])
    account = Input(shape = (3,), name = 'account')
    account_ = Dense(8, activation = 'relu')(account)
    dt = Input(shape = (7,), name = 'dt')
    dt_ = Dense(16, activation = 'relu')(dt)
    inputs['input'].extend([account, dt])
    data = Concatenate(axis = 1)(lstm)
    data = Dense(128, activation = 'relu')(data)
    y = Concatenate(axis = 1)([data, account, dt])
    y = Dense(256, activation = 'relu')(y)
    y = Dense(64, activation = 'relu')(y)
    y = Dense(16, activation = 'relu')(y)
    output = Dense(3, activation = 'linear')(y)
    model = Model(inputs = inputs['input'], outputs = output)
    model.compile(loss = 'mse', optimizer = 'adam', metrics = ['mae'])
    return model
More information here: https://keras.io/layers/recurrent/#cudnnlstm
This will significantly speed up the model =)