 

LSTM time series - strange val_accuracy, which normalizing method to use, and what to do in production after the model is fitted

I am making an LSTM time series prediction. My data looks like this:

[screenshot of the data table] So basically what I have is:

IDTime: Int for each day

TimePart: 0 = NightTime, 1 = Morning, 2 = Afternoon

And 4 columns for values I am trying to predict

I have 2686 values, 3 values per day, so around 900 days in total, plus the rows added for missing values.

I read and roughly followed https://www.tensorflow.org/tutorials/structured_data/time_series

  1. Replaced missing data: added the missing IDTimes 0-Max, each containing TimeParts 0-2 with 0 values (if missing), and replaced all NULL values with 0. I also removed the Date parameter, because I have IDTime (a rough sketch of this step is included right after the feature-selection snippet below)
  2. Set the Data (Pandas DataFrame) index to IDTime and TimePart
  3. Copied the features that I want
features_considered = ['TimePart', 'NmbrServices', 'LoggedInTimeMinutes','NmbrPersons', 'NmbrOfEmployees']
features = data[features_considered]
features.index = data.index
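
Roughly, steps 1-2 look like this (the CSV path and the Date column name are placeholders; the three TimeParts 0-2 are the ones listed above):

    import pandas as pd

    # Rough sketch of steps 1-2: build the full IDTime x TimePart grid,
    # insert rows for missing combinations, replace NULL values with 0 and
    # keep (IDTime, TimePart) as the index. 'services.csv' is a placeholder.
    data = pd.read_csv('services.csv')

    full_index = pd.MultiIndex.from_product(
        [range(int(data['IDTime'].max()) + 1), [0, 1, 2]],  # IDTime 0-Max, TimeParts 0-2
        names=['IDTime', 'TimePart'])

    data = (data.drop(columns=['Date'])            # IDTime already encodes the day
                .set_index(['IDTime', 'TimePart'])
                .reindex(full_index)               # adds rows for missing IDTime/TimePart pairs
                .fillna(0))                        # replaces NULL values with 0
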
  4. Used mean/std normalization on the training data. I am creating 4 different models, one for each feature I am trying to predict. In this current one I have set currentFeatureIndex = 1, which is NmbrServices
    currentFeatureIndex = 1   # 1 = NmbrServices
    tf.random.set_seed(13)

    dataset = features.values
    TRAIN_SPLIT = int(dataset[:, currentFeatureIndex].size * 80 / 100)

    # Normalize with statistics computed on the training split only
    data_mean = dataset[:TRAIN_SPLIT].mean(axis=0)
    data_std = dataset[:TRAIN_SPLIT].std(axis=0)
    dataset = (dataset - data_mean) / data_std
  5. I then created the dataset: the previous X values together with the next 3 future values I want to predict. I am using multivariate_data from the TensorFlow example, with the step parameter removed (a rough sketch of this helper is included after the training code below)
    x_train_multi, y_train_multi = multivariate_data(dataset, dataset[:,currentFeatureIndex], 0,TRAIN_SPLIT, past_history,future_target)

    x_val_multi, y_val_multi = multivariate_data(dataset, dataset[:,currentFeatureIndex],TRAIN_SPLIT, None, past_history,future_target)


    print ('History shape : {}'.format(x_train_multi[0].shape))
    print ('\n Target shape: {}'.format(y_train_multi[0].shape))

    BATCH_SIZE = 1024
    BUFFER_SIZE = 8096

    train_data_multi = tf.data.Dataset.from_tensor_slices((x_train_multi, y_train_multi))

    train_data_multi = train_data_multi.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE).repeat()

    val_data_multi = tf.data.Dataset.from_tensor_slices((x_val_multi, y_val_multi))
    val_data_multi = val_data_multi.batch(BATCH_SIZE).repeat()

    multi_step_model = tf.keras.models.Sequential()
    multi_step_model.add(tf.keras.layers.LSTM(32, activation='relu'))
    multi_step_model.add(tf.keras.layers.Dropout(0.1))
    multi_step_model.add(tf.keras.layers.Dense(future_target))

    multi_step_model.compile(optimizer=tf.keras.optimizers.RMSprop(clipvalue=1.0), loss='mae', metrics=['accuracy'])

    EVALUATION_INTERVAL = 200
    EPOCHS = 25

    currentName = 'test'
    csv_logger = tf.keras.callbacks.CSVLogger(currentName + '.log', separator=',', append=False)

    multi_step_history = multi_step_model.fit(train_data_multi, epochs=EPOCHS, steps_per_epoch=EVALUATION_INTERVAL, validation_data=val_data_multi, validation_steps=50, callbacks = [csv_logger])
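
For reference, this is roughly the windowing helper I use, adapted from the TensorFlow tutorial with the step argument removed (the exact version may differ slightly):

    import numpy as np

    def multivariate_data(dataset, target, start_index, end_index,
                          history_size, target_size):
        # Each sample is the previous `history_size` rows of all features;
        # the label is the next `target_size` values of the target column.
        data, labels = [], []
        start_index = start_index + history_size
        if end_index is None:
            end_index = len(dataset) - target_size
        for i in range(start_index, end_index):
            indices = range(i - history_size, i)        # past_history window
            data.append(dataset[indices])
            labels.append(target[i:i + target_size])    # next future_target values
        return np.array(data), np.array(labels)
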

In this example I also removed the first 800 values with data[600:], because the data is not as it should be after replacing the missing values.

And I get this final value after 25 epochs:

 200/200 [==============================] - 12s 61ms/step - loss: 0.1540 - accuracy: 0.9505 - val_loss: 0.1599 - val_accuracy: 1.0000

Questions:

  1. Why is it that the val_accuracy is always 1.0? This happens for most of the features

  2. I also tried normalizing values from 0-1 with:

    features.loc[:,'NmbrServices'] / features.loc[:,'NmbrServices'].max() and I get:

    200/200 [==============================] - 12s 60ms/step - loss: 0.0461 - accuracy: 0.9538 - val_loss: 0.0434 - val_accuracy: 1.0000

    For the feature I use here, it looks better using feature/featureMax, but for other features I can get, using mean/std:

    • loss: 0.1461 - accuracy: 0.9338 - val_loss: 0.1634 - val_accuracy: 1.0000

    And when using feature / featureMax, I get:

    • loss: 0.0323 - accuracy: 0.8523 - val_loss: 0.0463 - val_accuracy: 1.0000

    In this case, which one is better? The one with the higher accuracy or the one with the lower loss?

  3. If I get a good val_loss and train_loss at around 8 epochs and then they go up, can I then just train the model for 8 epochs and save it?

  4. In the end I save the model in H5 format and load it, because I want to predict new values for the next day, using the last 45 values for prediction. How can I then fit this new data to the model? Do you just call model.fit(newDataX, newDataY)? Or do you need to compile it again on the new data? (See the sketch after this list of questions.)

    4.1 How many times should you rerun this model then? If you ran it on years 2016-2018 and you are currently in year 2020, should you, for example, recompile it once per year with data from 2017-2019?

  5. Is it possible to predict multiple features for the next day, or is it better to use multiple models?
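
To make question 4 concrete, this is roughly the workflow I have in mind (the file name is a placeholder; multi_step_model and dataset are the objects from the code above, and the new-data windows would be built the same way as the training windows):

    import numpy as np
    import tensorflow as tf

    multi_step_model.save('nmbr_services.h5')                # save after training

    # Later, in production: load the model (it keeps its compiled state)
    model = tf.keras.models.load_model('nmbr_services.h5')

    # Predict the next day from the last 45 normalized rows
    last_window = dataset[-45:]                               # shape: (45, n_features)
    next_day = model.predict(last_window[np.newaxis, ...])    # shape: (1, future_target)

    # Continuing training on newly collected windows; no recompile is needed
    # because load_model restores the optimizer and loss:
    # model.fit(new_x, new_y, epochs=5)
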

asked Dec 06 '19 by Tomek


1 Answer

I would suggest you use batch normalization; whether you use a vanilla LSTM or a stacked LSTM is completely up to you.

I would recommend you go through this.
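
For example, a minimal sketch of a stacked LSTM with batch normalization between the recurrent layers, reusing the 45-step history, 5 features and 3-step target from the question (layer sizes are placeholders):

    import tensorflow as tf

    past_history, n_features, future_target = 45, 5, 3   # values taken from the question

    model = tf.keras.models.Sequential([
        tf.keras.layers.LSTM(32, return_sequences=True,
                             input_shape=(past_history, n_features)),
        tf.keras.layers.BatchNormalization(),   # normalizes the LSTM outputs per feature
        tf.keras.layers.LSTM(16),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.Dense(future_target)
    ])
    model.compile(optimizer=tf.keras.optimizers.RMSprop(clipvalue=1.0), loss='mae')
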

answered Oct 16 '22 by champion-runner