
Cannot train a neural network solving XOR mapping

I am trying to implement a simple classifier for the XOR problem in Keras. Here is the code:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD
import numpy

X = numpy.array([[1., 1.], [0., 0.], [1., 0.], [0., 1.], [1., 1.], [0., 0.]])
y = numpy.array([[0.], [0.], [1.], [1.], [0.], [0.]])
model = Sequential()
model.add(Dense(2, input_dim=2, init='uniform', activation='sigmoid'))
model.add(Dense(3, init='uniform', activation='sigmoid'))
model.add(Dense(1, init='uniform', activation='softmax'))
sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='mean_squared_error', optimizer=sgd)

model.fit(X, y, nb_epoch=20)
print()
score = model.evaluate(X, y)
print()
print(score)
print(model.predict(numpy.array([[1, 0]])))
print(model.predict(numpy.array([[0, 0]])))

I tried changing the number of epochs, the learning rate, and other parameters, but the loss stays constant from the first epoch to the last.

Epoch 13/20
6/6 [==============================] - 0s - loss: 0.6667 
Epoch 14/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 15/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 16/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 17/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 18/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 19/20
6/6 [==============================] - 0s - loss: 0.6667
Epoch 20/20
6/6 [==============================] - 0s - loss: 0.6667

6/6 [==============================] - 0s

0.666666686535
[[ 1.]]
[[ 1.]]

How do you train this network in Keras?

Also, is there a better library for implementing neural networks? I tried PyBrain, but it has been abandoned. I also tried scikit-neuralnetwork, but its documentation is so cryptic that I couldn't figure out how to train a model. And I seriously doubt whether Keras even works.

Aditya Shinde asked Dec 16 '15 12:12


1 Answer

In your example, the output layer is a Dense layer with 1 unit and a softmax activation. Softmax normalizes across the units of a layer, so with a single unit the value is always 1.0. No information can flow from your inputs to your outputs, and the network cannot learn anything. Softmax is only really useful when you need to predict a probability distribution over n classes, where n is greater than 2 (for two classes, a single sigmoid unit does the same job).
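To make this concrete, here is a minimal sketch in plain Python (no Keras) of what a one-unit softmax does. The `softmax` helper and the sample pre-activation values are illustrative assumptions, not code from the question; the targets are the `y` values the question uses. It also shows where the constant loss of 0.6667 comes from: four of the six targets are 0, and every prediction is 1.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats (illustrative helper)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# A one-unit softmax layer divides a single value by itself,
# so the output is 1.0 whatever the pre-activation is.
# The three pre-activation values below are arbitrary examples.
single_unit_outputs = [softmax([z])[0] for z in (-5.0, 0.0, 3.7)]

# Targets from the question; the prediction is pinned at 1.0 for every sample.
targets = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
mse = sum((1.0 - t) ** 2 for t in targets) / len(targets)

print(single_unit_outputs)  # [1.0, 1.0, 1.0]
print(round(mse, 4))        # 0.6667
```

Because the output does not depend on the pre-activation at all, its gradient with respect to every weight is zero, which is consistent with the loss not moving between epochs.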

The other answers suggest changes to the code to make it work. Just removing activation='softmax' may be enough.
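As a rough illustration of why dropping the softmax can be enough, the sketch below shows that a small network with sigmoid hidden units and a plain linear output (no activation) can represent XOR. Everything here is an assumption for illustration: the weights are hand-picked rather than trained, the network has a single hidden layer of two units instead of the question's two hidden layers, and the names `sigmoid` and `xor_net` are hypothetical. It says nothing about how quickly SGD would find such weights.

```python
import math

def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))

def xor_net(x1, x2):
    # Hidden layer: two sigmoid units with hand-picked (hypothetical) weights.
    h_or = sigmoid(20.0 * x1 + 20.0 * x2 - 10.0)     # approximately x1 OR x2
    h_nand = sigmoid(-20.0 * x1 - 20.0 * x2 + 30.0)  # approximately NOT (x1 AND x2)
    # Linear output unit (no activation): OR + NAND - 1 approximates XOR.
    return h_or + h_nand - 1.0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [xor_net(a, b) for a, b in inputs]
print([round(o) for o in outputs])  # [0, 1, 1, 0]
```

With a one-unit softmax on top, the same hidden layer would still produce 1.0 for all four inputs, so the choice of output activation is what decides whether the mapping is expressible at all.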

Keras does generally work.

JeremyR answered Sep 24 '22 20:09