
How to prepare a dataset for Keras?


I want to run a set of labeled vectors through a Keras neural network.


Looking at the Keras example dataset mnist:

from keras.datasets import mnist
(x_tr, y_tr), (x_te, y_te) = mnist.load_data()
print(x_tr.shape)

It seems to be a 3-dimensional NumPy array:

(60000, 28, 28)
  • the 1st dimension indexes the samples
  • the 2nd and 3rd dimensions hold each sample's features
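
For a fully connected (Dense) network, each sample is usually flattened into one feature vector, so the array becomes two-dimensional: (samples, features). A minimal sketch with NumPy, assuming a small zero-filled stand-in array in place of the real MNIST images (the name `images` and the sample count of 6 are illustrative only):

```python
import numpy as np

# Stand-in for MNIST images: 6 samples of 28 x 28 pixels (the real set has 60000).
images = np.zeros((6, 28, 28), dtype="uint8")

# Flatten each 28 x 28 image into a single 784-element feature vector.
flat = images.reshape(images.shape[0], -1).astype("float32")

print(images.shape)  # (6, 28, 28)
print(flat.shape)    # (6, 784)
```

The 128-feature vectors built below already have this two-dimensional layout, so they need no reshaping.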


Building the labeled vectors:

import numpy

# 10**4 all-ones vectors followed by 10**4 all-zeros vectors, 128 features each
X_train = numpy.array([[1] * 128] * (10 ** 4) + [[0] * 128] * (10 ** 4))
X_test = numpy.array([[1] * 128] * (10 ** 2) + [[0] * 128] * (10 ** 2))

Y_train = numpy.array([True] * (10 ** 4) + [False] * (10 ** 4))
Y_test = numpy.array([True] * (10 ** 2) + [False] * (10 ** 2))

X_train = X_train.astype("float32")
X_test = X_test.astype("float32")

Y_train = Y_train.astype("bool")
Y_test = Y_test.astype("bool")

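The training code:

from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import RMSprop

model = Sequential()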
model.add(Dense(128, 50))
model.add(Dense(50, 50))
model.add(Dense(50, 1))
model.add(Activation('softmax'))

rms = RMSprop()
model.compile(loss='binary_crossentropy', optimizer=rms)

model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          show_accuracy=True, verbose=2, validation_data=(X_test, Y_test))

score = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=0)
print('Test score:', score[0])
print('Test accuracy:', score[1])


Output:

Test score: 13.9705320154
Test accuracy: 1.0

Why do I get such a bad result for such a simple dataset? Is my dataset malformed?


Asked by Michael on Aug 07 '15 at 14:08



1 Answer

A softmax over just one output node doesn't make much sense: it normalizes over a single value, so the output is always 1 regardless of the input. If you change model.add(Activation('softmax')) to model.add(Activation('sigmoid')), your network performs well.
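
The difference between the two activations on a single output node can be checked without Keras. The sketch below uses plain NumPy; the `softmax` and `sigmoid` helpers and the `logits` values are illustrative assumptions, not code from the question:

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical pre-activation values of one output node for four samples.
logits = np.array([[-3.0], [-0.5], [0.5], [3.0]])

single_softmax = softmax(logits)  # each row has one entry, so it normalizes to 1.0
single_sigmoid = sigmoid(logits)  # values spread across the interval (0, 1)

print(single_softmax.ravel())
print(single_sigmoid.ravel())
```

With a one-node softmax the prediction is pinned at 1.0 for every sample, so no weight update can move it toward the False class; the sigmoid output can move toward either 0 or 1.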

Alternatively, you can use two output nodes, where [1, 0] represents True and [0, 1] represents False. Then a softmax layer makes sense. You just have to change your Y_train and Y_test accordingly.
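
One possible way to build the two-column labels, sketched with NumPy on a shortened label array (the names `labels` and `one_hot` are illustrative; the same expression would apply to Y_train and Y_test):

```python
import numpy as np

# Boolean labels laid out like Y_train in the question, shortened to 6 entries.
labels = np.array([True] * 3 + [False] * 3)

# Column 0 is 1 for True samples, column 1 is 1 for False samples.
one_hot = np.stack([labels, ~labels], axis=1).astype("float32")

print(one_hot.shape)  # (6, 2)
print(one_hot)
```

Each row then sums to 1, which matches what a two-node softmax output produces.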

Answered by aleju on Oct 16 '22 at 20:10
