Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

0 accuracy with LSTM

I trained LSTM classification model, but got weird results (0 accuracy). Here is my dataset with preprocessing steps:

import pandas as pd
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow import keras
import numpy as np

url = 'https://raw.githubusercontent.com/MislavSag/trademl/master/trademl/modeling/random_forest/X_TEST.csv'
X_TEST = pd.read_csv(url, sep=',')
url = 'https://raw.githubusercontent.com/MislavSag/trademl/master/trademl/modeling/random_forest/labeling_info_TEST.csv'
labeling_info_TEST = pd.read_csv(url, sep=',')


# TRAIN TEST SPLIT
X_train, X_test, y_train, y_test = train_test_split(
    X_TEST.drop(columns=['close_orig']), labeling_info_TEST['bin'],
    test_size=0.10, shuffle=False, stratify=None)


### PREPARE LSTM
x = X_train['close'].values.reshape(-1, 1)
y = y_train.values.reshape(-1, 1)
x_test = X_test['close'].values.reshape(-1, 1)
y_test = y_test.values.reshape(-1, 1)
train_val_index_split = 0.75
train_generator = keras.preprocessing.sequence.TimeseriesGenerator(
    data=x,
    targets=y,
    length=30,
    sampling_rate=1,
    stride=1,
    start_index=0,
    end_index=int(train_val_index_split*X_TEST.shape[0]),
    shuffle=False,
    reverse=False,
    batch_size=128
)
validation_generator = keras.preprocessing.sequence.TimeseriesGenerator(
    data=x,
    targets=y,
    length=30,
    sampling_rate=1,
    stride=1,
    start_index=int((train_val_index_split*X_TEST.shape[0] + 1)),
    end_index=None,  #int(train_test_index_split*X.shape[0])
    shuffle=False,
    reverse=False,
    batch_size=128
)
test_generator = keras.preprocessing.sequence.TimeseriesGenerator(
    data=x_test,
    targets=y_test,
    length=30,
    sampling_rate=1,
    stride=1,
    start_index=0,
    end_index=None,
    shuffle=False,
    reverse=False,
    batch_size=128
)

# convert generator to inmemory 3D series (if enough RAM)
def generator_to_obj(generator):
    xlist = []
    ylist = []
    for i in range(len(generator)):
        x, y = train_generator[i]
        xlist.append(x)
        ylist.append(y)
    X_train = np.concatenate(xlist, axis=0)
    y_train = np.concatenate(ylist, axis=0)
    return X_train, y_train

X_train_lstm, y_train_lstm = generator_to_obj(train_generator)
X_val_lstm, y_val_lstm = generator_to_obj(validation_generator)
X_test_lstm, y_test_lstm = generator_to_obj(test_generator)

# test for shapes
print('X and y shape train: ', X_train_lstm.shape, y_train_lstm.shape)
print('X and y shape validate: ', X_val_lstm.shape, y_val_lstm.shape)
print('X and y shape test: ', X_test_lstm.shape, y_test_lstm.shape)

and here is my model with resuslts:

### MODEL
model = keras.models.Sequential([
        keras.layers.LSTM(124, return_sequences=True, input_shape=[None, 1]),
        keras.layers.LSTM(258),
        keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train_lstm, y_train_lstm, epochs=10, batch_size=128,
                    validation_data=[X_val_lstm, y_val_lstm])
# history = model.fit_generator(train_generator, epochs=40, validation_data=validation_generator, verbose=1)
score, acc = model.evaluate(X_val_lstm, y_val_lstm,
                            batch_size=128)

historydf = pd.DataFrame(history.history)
historydf.head(10)

Why do I get 0 accuracy?

like image 561
Mislav Avatar asked Jun 18 '20 08:06

Mislav


People also ask

What is the accuracy of Lstm?

LSTM & Machine Learning models (89% accuracy)

What is SparseCategoricalAccuracy?

SparseCategoricalAccuracy classThis metric creates two local variables, total and count that are used to compute the frequency with which y_pred matches y_true . This frequency is ultimately returned as sparse categorical accuracy : an idempotent operation that simply divides total by count .

What does model evaluate return in keras?

The model. evaluate() return scalar test loss if the model has a single output and no metrics or list of scalars if the model has multiple outputs and multiple metrics. The attribute model. metrics_names will give you the display labels for the scalar outputs and metrics names.


1 Answers

You're using sigmoid activation, which means your labels must be in range 0 and 1. But in your case, the labels are 1. and -1.

Just replace -1 with 0.

for i, y in enumerate(y_train_lstm):
    if y == -1.:
        y_train_lstm[i,:] = 0. 
for i, y in enumerate(y_val_lstm):
    if y == -1.:
        y_val_lstm[i,:] = 0. 

for i, y in enumerate(y_test_lstm):
    if y == -1.:
        y_test_lstm[i,:] = 0. 

Sidenote:

enter image description here

The signals are very close, it would be hard to distinguish them. So, probably accuracy won't be high with simple models.

After training with 0. and 1. labels,

model = keras.models.Sequential([
        keras.layers.LSTM(124, return_sequences=True, input_shape=(30, 1)),
        keras.layers.LSTM(258),
        keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train_lstm, y_train_lstm, epochs=5, batch_size=128,
                    validation_data=(X_val_lstm, y_val_lstm))
# history = model.fit_generator(train_generator, epochs=40, validation_data=validation_generator, verbose=1)
score, acc = model.evaluate(X_val_lstm, y_val_lstm,
                            batch_size=128)

historydf = pd.DataFrame(history.history)
historydf.head(10)
Epoch 1/5
12/12 [==============================] - 5s 378ms/step - loss: 0.7386 - accuracy: 0.4990 - val_loss: 0.6959 - val_accuracy: 0.4896
Epoch 2/5
12/12 [==============================] - 4s 318ms/step - loss: 0.6947 - accuracy: 0.5133 - val_loss: 0.6959 - val_accuracy: 0.5104
Epoch 3/5
12/12 [==============================] - 4s 318ms/step - loss: 0.6941 - accuracy: 0.4895 - val_loss: 0.6930 - val_accuracy: 0.5104
Epoch 4/5
12/12 [==============================] - 4s 332ms/step - loss: 0.6946 - accuracy: 0.5269 - val_loss: 0.6946 - val_accuracy: 0.5104
Epoch 5/5
12/12 [==============================] - 4s 334ms/step - loss: 0.6931 - accuracy: 0.4901 - val_loss: 0.6929 - val_accuracy: 0.5104
3/3 [==============================] - 0s 73ms/step - loss: 0.6929 - accuracy: 0.5104

    loss    accuracy    val_loss    val_accuracy
0   0.738649    0.498980    0.695888    0.489583
1   0.694708    0.513256    0.695942    0.510417
2   0.694117    0.489463    0.692987    0.510417
3   0.694554    0.526852    0.694613    0.510417
4   0.693118    0.490143    0.692936    0.510417

Source code in colab: https://colab.research.google.com/drive/10yRf4TfGDnp_4F2HYoxPyTlF18no-8Dr?usp=sharing

like image 160
Zabir Al Nazi Avatar answered Oct 23 '22 20:10

Zabir Al Nazi