Keras network producing inverse predictions

I have a timeseries dataset and I am trying to train a network so that it overfits (obviously, that's just the first step, I will then battle the overfitting).

The network has two layers: LSTM (32 neurons) and Dense (1 neuron, no activation)

Training/model has these parameters: epochs: 20, steps_per_epoch: 100, loss: "mse", optimizer: "rmsprop".

TimeseriesGenerator produces the input series with: length: 1, sampling_rate: 1, batch_size: 1.

I would expect the network to simply memorize such a small dataset (I have even tried a much more complicated network, to no avail) and the loss on the training set to be pretty much zero. It is not, and when I visualize the results on the training set like this:

import numpy
import matplotlib.pyplot as plt

# Predict over the training generator and compare against the generator's targets
y_pred = model.predict_generator(gen)
plot_points = 40
epochs = range(1, plot_points + 1)
pred_points = numpy.resize(y_pred[:plot_points], (plot_points,))
target_points = gen.targets[:plot_points]
plt.plot(epochs, pred_points, 'b', label='Predictions')
plt.plot(epochs, target_points, 'r', label='Targets')
plt.legend()
plt.show()

I get:

[Chart: predictions vs. targets on the training set]

The predictions have a somewhat smaller amplitude but are precisely inverse to the targets. By the way, this is not memorization: they are inverted even for the test dataset, which the algorithm hasn't trained on at all. It appears that instead of memorizing the dataset, my network just learned to negate the input value and slightly scale it down. Any idea why this is happening? It doesn't seem like the solution the optimizer should have converged to (the loss is pretty big).

EDIT (some relevant parts of my code):

import keras
from keras.models import Sequential
from keras.layers import LSTM, Dense

train_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x,
        y,
        length=1,
        sampling_rate=1,
        batch_size=1,
        shuffle=False
    )

model = Sequential()
model.add(LSTM(32, input_shape=(1, 1), return_sequences=False))
model.add(Dense(1))  # input_shape is only needed on the first layer, so it is omitted here

model.compile(
    loss="mse",
    optimizer="rmsprop",
    metrics=[keras.metrics.mean_squared_error]
)

history = model.fit_generator(
    train_gen,
    epochs=20,
    steps_per_epoch=100
)

EDIT (different, randomly generated dataset):

[Chart: inverse predictions on the randomly generated dataset]

I had to increase the number of LSTM units to 256; with the previous setting (32 units), the blue line was pretty much flat. However, with the increase, the same pattern arises: inverse predictions with a somewhat smaller amplitude.

EDIT (targets shifted by +1):

[Chart: predictions vs. targets shifted by +1]

Shifting the targets by one relative to the predictions doesn't produce a much better fit. Notice the highlighted parts, where the graph isn't just alternating; it's more apparent there.

EDIT (increased length to 2 ... TimeseriesGenerator(length=2, ...)):

[Chart: predictions vs. targets with length=2]

With length=2 the predictions stop tracking the targets so closely, but the overall pattern of inversion still holds.
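For context, here is a minimal sketch (assuming the standard keras.preprocessing.sequence.TimeseriesGenerator behaviour; the integer data is made up purely so the indices are easy to read) of what the input windows look like with length=2:

import numpy as np
from keras.preprocessing.sequence import TimeseriesGenerator

data = np.arange(6).reshape(-1, 1)     # inputs:  0, 1, 2, 3, 4, 5
targets = np.arange(6).reshape(-1, 1)  # targets: 0, 1, 2, 3, 4, 5

gen = TimeseriesGenerator(data, targets, length=2, sampling_rate=1, batch_size=1)

x0, _ = gen[0]
print(x0.shape)  # (1, 2, 1) -> each sample is now a window of the two previous values
print(x0)        # [[[0], [1]]], i.e. data[0:2]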

Asked by asdf, Jan 14 '19

2 Answers

You say that your network "just learned to negate the input value and slightly scale it down". I don't think so. It is very likely that all you are seeing is the network performing poorly and simply predicting the previous value (but scaled, as you say). This is an issue I have seen again and again; here is another example, and another, of it. Also, remember that it is very easy to fool yourself by shifting the data by one: it is very likely you are simply shifting the poor prediction back in time and getting an overlap.
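To see why a "predict the previous value" strategy looks exactly like an inversion on data that alternates in sign, here is a minimal sketch (the toy series below is made up for illustration):

import numpy as np

# Toy series that alternates in sign, like the randomly generated data in the question
rng = np.random.RandomState(0)
y = rng.uniform(0, 10, size=40)
y[::2] *= -1

# "Persistence" baseline: predict the previous observed value
y_true = y[1:]
y_pred = y[:-1]

# Consecutive values have opposite signs, so the previous-value prediction is
# anti-correlated with the target, i.e. the predicted curve looks like an
# inverted (and differently scaled) copy of the target curve.
print(np.corrcoef(y_pred, y_true)[0, 1])  # prints a strongly negative value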

Answered by André C. Andersen

EDIT: After the author's comments I no longer believe this is the correct answer, but I will keep it posted for posterity.

Great question, and the answer is due to how the TimeseriesGenerator works! Instead of grabbing (x, y) pairs with the same index (e.g. input x[0] with output target y[0]), it grabs the target with an offset of 1 (so x[0] is paired with y[1]).

Thus, plotting y with an offset of 1 will produce the desired fit.
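You can verify this offset directly with a minimal sketch (the integer series is made up purely so the indices are easy to read):

import numpy as np
from keras.preprocessing.sequence import TimeseriesGenerator

data = np.arange(5).reshape(-1, 1)     # inputs:  0, 1, 2, 3, 4
targets = np.arange(5).reshape(-1, 1)  # targets: 0, 1, 2, 3, 4

gen = TimeseriesGenerator(data, targets, length=1, sampling_rate=1, batch_size=1)

x0, y0 = gen[0]
print(x0, y0)  # [[[0]]] [[1]] -> input data[0] is paired with targets[1], not targets[0]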

Code to simulate:

import numpy as np
import keras
import matplotlib.pyplot as plt

# Alternating-sign random series; y is x shifted forward by one step
x = np.random.uniform(0, 10, size=41).reshape(-1, 1)
x[::2] *= -1
y = x[1:]
x = x[:-1]

train_gen = keras.preprocessing.sequence.TimeseriesGenerator(
        x,
        y,
        length=1,
        sampling_rate=1,
        batch_size=1,
        shuffle=False
    )

model = keras.models.Sequential()
model.add(keras.layers.LSTM(100, input_shape=(1, 1), return_sequences=False))
model.add(keras.layers.Dense(1))

model.compile(
    loss="mse",
    optimizer="rmsprop",
    metrics=[keras.metrics.mean_squared_error]
)
model.optimizer.lr /= .1  # bump the learning rate up (divide by 0.1, i.e. multiply by 10)

history = model.fit_generator(
    train_gen,
    epochs=20,
    steps_per_epoch=100
)

Proper plotting:

y_pred = model.predict_generator(train_gen)
plot_points = 39
epochs = range(1, plot_points + 1)
pred_points = np.resize(y_pred[:plot_points], (plot_points,))

target_points = train_gen.targets[1:plot_points+1] #NOTICE DIFFERENT INDEXING HERE

plt.plot(epochs, pred_points, 'b', label='Predictions')
plt.plot(epochs, target_points, 'r', label='Targets')
plt.legend()
plt.show()

Output: notice how the fit is no longer inverted and is mostly very accurate:

[Chart: with the proper offset on target_points]

This is how it looks when the offset is incorrect:

[Chart: without the proper offset]

Answered by tRosenflanz