
Simple Recurrent Neural Network input shape

I am trying to code a very simple RNN example with keras but the results are not as expected.

My X_train is a repeated list with length 6000 like: 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, ...

I formatted this to shape: (6000, 1, 1)

My y_train is a repeated list with length 6000 like: 1, 0.8, 0.6, 0, 0, 0, 1, 0.8, 0.6, 0, ...

I formatted this to shape: (6000, 1)

In my understanding, the recurrent neural network should learn to predict the 0.8 and 0.6 correctly because it can remember the 1 in X_train two timesteps ago.

My model:

import numpy as np
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense

model = Sequential()
model.add(SimpleRNN(input_dim=1, output_dim=50))
model.add(Dense(output_dim=1, activation="sigmoid"))
model.compile(loss="mse", optimizer="rmsprop")
model.fit(X_train, y_train, nb_epoch=10, batch_size=32)

The model trains without errors and the loss settles at about 0.1015, but the predictions are not what I expect.

test case                                       model result    expected result
model.predict(np.array([[[1]]]))                0.9825          1
model.predict(np.array([[[1],[0]]]))            0.2081          0.8
model.predict(np.array([[[1],[0],[0]]]))        0.2778          0.6
model.predict(np.array([[[1],[0],[0],[0]]]))    0.3186          0

Any hints what I am misunderstanding here?

asked Jul 10 '16 at 16:07 by user3406687



1 Answer

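The input should be three-dimensional: the three axes are the number of samples, the number of time steps per sample, and the input dimension (features per time step). With shape (6000, 1, 1), every sample contains a single time step, so during training the network never sees the 1 that came before and has nothing to remember. Each sample needs to hold the whole sequence instead, and the targets need one value per time step.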
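A small NumPy sketch of the two layouts may help. It assumes the 6000-element lists from the question are built by repeating the six-value pattern 1000 times (the variable names are mine). It also estimates the lowest MSE any model could reach if it only ever sees the current input, which is the situation with one time step per sample:

```python
import numpy as np

x_seed = [1, 0, 0, 0, 0, 0]
y_seed = [1, 0.8, 0.6, 0, 0, 0]
x_flat = np.array(x_seed * 1000, dtype=float)  # length 6000, as in the question
y_flat = np.array(y_seed * 1000, dtype=float)

# Layout from the question: 6000 samples with a single time step each.
x_single = x_flat.reshape(6000, 1, 1)

# Layout with context: 1000 samples, 6 time steps each, 1 feature per step.
x_seq = x_flat.reshape(-1, len(x_seed), 1)
y_seq = y_flat.reshape(-1, len(y_seed), 1)
print(x_single.shape, x_seq.shape, y_seq.shape)

# With one time step per sample the prediction can depend only on the current
# input, so under MSE the best possible output is the mean target per input value.
best_for_zero = y_flat[x_flat == 0].mean()  # about 0.28
best_for_one = y_flat[x_flat == 1].mean()   # 1.0
best_pred = np.where(x_flat == 1, best_for_one, best_for_zero)
loss_floor = np.mean((y_flat - best_pred) ** 2)  # about 0.1013
print(best_for_zero, loss_floor)
```

The floor of roughly 0.1013 is close to the loss of about 0.1015 reported in the question, which suggests the original model learned little more than the average target for each input value.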

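Once the input is reshaped so that each sample holds the full sequence, and the recurrent layer returns an output at every time step (return_sequences = True), the RNN does predict the target sequence well: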

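import numpy as np
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense, TimeDistributed

np.random.seed(1337)

sample_size = 256
x_seed = [1, 0, 0, 0, 0, 0]
y_seed = [1, 0.8, 0.6, 0, 0, 0]

# Each sample is one full 6-step sequence: shape (samples, time steps, features)
x_train = np.array([x_seed] * sample_size).reshape(sample_size, len(x_seed), 1)
y_train = np.array([y_seed] * sample_size).reshape(sample_size, len(y_seed), 1)

# Keras 1.x argument names, as in the question
# (Keras 2 renamed output_dim to units and nb_epoch to epochs)
model = Sequential()
model.add(SimpleRNN(input_dim=1, output_dim=50, return_sequences=True))  # one output per time step
model.add(TimeDistributed(Dense(output_dim=1, activation="sigmoid")))    # same Dense layer at every step
model.compile(loss="mse", optimizer="rmsprop")
model.fit(x_train, y_train, nb_epoch=10, batch_size=32)

print(model.predict(np.array([[[1], [0], [0], [0], [0], [0]]])))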
#[[[ 0.87810659]
#[ 0.80646527]
#[ 0.61600274]
#[ 0.01652312]
#[ 0.00930419]
#[ 0.01328572]]]
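
Why the time-step axis matters can be sketched with a hand-rolled recurrence in NumPy. This is not Keras code: the weights are random and untrained, and the names W_x, W_h and simple_rnn are made up for illustration. It follows the usual simple-RNN update h_t = tanh(x_t W_x + h_{t-1} W_h + b), with the state reset to zero at the start of every sample:

```python
import numpy as np

rng = np.random.default_rng(0)
units = 4
W_x = rng.normal(size=(1, units))      # input-to-hidden weights (random, untrained)
W_h = rng.normal(size=(units, units))  # hidden-to-hidden weights (random, untrained)
b = np.zeros(units)

def simple_rnn(sequence):
    """Return the hidden state after each time step of one sample."""
    h = np.zeros(units)  # the state starts from zero for every sample
    states = []
    for x_t in sequence:  # x_t has shape (features,) = (1,)
        h = np.tanh(x_t @ W_x + h @ W_h + b)
        states.append(h)
    return np.stack(states)  # shape (time steps, units)

# One sample with six time steps: the 1 at step 0 influences all later states.
states = simple_rnn(np.array([[1.0], [0.0], [0.0], [0.0], [0.0], [0.0]]))

# One sample with a single time step: a lone 0 has no history to draw on.
lone_zero = simple_rnn(np.array([[0.0]]))

print(states.shape, lone_zero.shape)
print(states[1])     # state for the first 0 that follows the 1
print(lone_zero[0])  # state for a 0 seen in isolation
```

A lone 0 input always yields the zero state here, whereas the same 0 arriving after a 1 yields a non-zero state that still carries a trace of the earlier input. That trace is what a trained model can map to 0.8 and then 0.6.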
answered Oct 20 '22 at 07:10 by Christian Hirsch
