I am training an LSTM network using Keras with TensorFlow as the backend. The network is used for energy load forecasting, and the dataset has shape (32292, 24). When the program runs, the loss is NaN right from the first epoch. How can I solve this problem?
PS: As for data preprocessing, I divided each value by 100000, since each value is initially a 4- or 5-digit number. As a result, my values should fall in the range (0, 1).
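import time
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation

def build_model():
    model = Sequential()
    layers = [1, 50, 100, 1]  # input dim, first LSTM units, second LSTM units, output dim
    # Keras 1.x argument names (input_dim, output_dim) are kept from the original post
    model.add(LSTM(input_dim=layers[0], output_dim=layers[1], return_sequences=True))
    model.add(Dropout(0.2))
    model.add(LSTM(layers[2], return_sequences=False))
    model.add(Dropout(0.2))
    model.add(Dense(output_dim=layers[3]))
    model.add(Activation("linear"))
    start = time.time()
    model.compile(loss="mse", optimizer="rmsprop")
    print("Compilation Time : {}".format(time.time() - start))
    return model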
def run_network():
    global_start_time = time.time()
    epochs = 5000
    model = build_model()
    predicted = None  # stays None if training is interrupted
    try:
        # x_train, y_train, x_test and y_test are assumed to be defined elsewhere
        model.fit(x_train, y_train, batch_size=400, nb_epoch=epochs, validation_split=0.05)
        predicted = model.predict(x_test)
        predicted = np.reshape(predicted, (predicted.size,))
    except KeyboardInterrupt:
        print("Training duration (s) : {}".format(time.time() - global_start_time))
    try:
        fig = plt.figure()
        ax = fig.add_subplot(111)
        ax.plot(predicted[:100])
        plt.show()
    except Exception as e:
        print(str(e))
        print("Training duration (s) : {}".format(time.time() - global_start_time))
    return model, y_test, predicted
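Since the loss is NaN from the very first epoch, one thing worth ruling out is the input data itself. The following is a minimal sketch, assuming the data is held in a NumPy array. The helper `check_and_scale` and the sample values are hypothetical; they only illustrate the divide-by-100000 scaling described above, together with a check for NaN or infinite entries.

```python
import numpy as np


def check_and_scale(raw, divisor=100000.0):
    """Scale raw load values and report entries that can turn the loss into NaN.

    `raw` is assumed to be a 2-D array such as the (32292, 24) dataset.
    """
    data = np.asarray(raw, dtype=np.float64)
    # Count entries that are NaN or +/-inf; a single one can poison the loss.
    n_bad = int(np.count_nonzero(~np.isfinite(data)))
    scaled = data / divisor
    # Compute the range over finite entries only, so bad values do not hide it.
    finite = scaled[np.isfinite(scaled)]
    lo = float(finite.min()) if finite.size else float("nan")
    hi = float(finite.max()) if finite.size else float("nan")
    return scaled, n_bad, (lo, hi)


# Hypothetical sample: 4- and 5-digit loads with one missing reading.
sample = np.array([[12345.0, 9876.0], [np.nan, 54321.0]])
scaled, n_bad, (lo, hi) = check_and_scale(sample)
print("non-finite entries:", n_bad)
print("scaled range:", lo, hi)
```

If the count of non-finite entries is above zero, or the scaled range falls outside (0, 1), the preprocessing is a likely place to look before changing the model.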
I changed the activation function of the dense layer to 'softmax' (my case is a multi-class classification problem), and it works.
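For reference, softmax turns a vector of scores into probabilities that sum to 1, which is why it suits a multi-class output. The sketch below is a plain NumPy illustration, not Keras's own implementation; the function name `softmax` and the sample scores are made up for the example. It also shows a detail that matters for a regression model like the one in the question: with a single output unit, softmax always returns 1.0.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over the last axis."""
    scores = np.asarray(scores, dtype=np.float64)
    # Subtracting the max leaves the result unchanged but avoids overflow in exp().
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


probs = softmax([2.0, 1.0, 0.1])      # three hypothetical class scores
single = softmax([[3.7], [-12.0]])    # two samples, one output unit each
print(probs, probs.sum())
print(single)
```

So this change fits a classifier with several output classes, but it would probably not carry over to a forecasting network whose dense layer has one unit.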