
How to use keras for XOR

I want to practice Keras by coding an XOR network, but the result is not right. My code is below; thanks to everybody who can help.

from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD
import numpy as np

model = Sequential()  # two layers: 2 -> 4 -> 1
model.add(Dense(input_dim=2, output_dim=4, init="glorot_uniform"))
model.add(Activation("sigmoid"))
model.add(Dense(input_dim=4, output_dim=1, init="glorot_uniform"))
model.add(Activation("sigmoid"))
sgd = SGD(l2=0.0, lr=0.05, decay=1e-6, momentum=0.11, nesterov=True)
model.compile(loss='mean_absolute_error', optimizer=sgd)
print "begin to train"
list1 = [1, 1]
label1 = [0]
list2 = [1, 0]
label2 = [1]
list3 = [0, 0]
label3 = [0]
list4 = [0, 1]
label4 = [1]
train_data = np.array((list1, list2, list3, list4))  # four samples, trained for 1000 epochs
label = np.array((label1, label2, label3, label4))

model.fit(train_data, label, nb_epoch=1000, batch_size=4, verbose=1, shuffle=True, show_accuracy=True)
list_test = [0, 1]
test = np.array((list_test, list1))
classes = model.predict(test)
print classes

Output

[[ 0.31851079]
 [ 0.34130159]]
[[ 0.49635666]
 [ 0.51274764]]
asked Jul 22 '15 by Jaspn Wjbian



3 Answers

If I increase the number of epochs in your code to 50000, it does often converge to the right answer for me; it just takes a little while :)

It does often get stuck, though. I get better convergence properties if I change your loss function to 'mean_squared_error', which is a smoother function.

I get even faster convergence if I use the Adam or RMSProp optimizers. My final compile and fit lines, which work:

model.compile(loss='mse', optimizer='adam')
...
model.fit(train_data, label, nb_epoch=10000, batch_size=4, verbose=1, shuffle=True, show_accuracy=True)
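
For reference, here's a sketch of the asker's full script with just those two changes applied. It assumes the same old Keras API as the question (where fit still takes nb_epoch and show_accuracy); on current Keras those would be epochs and a metrics argument to compile:

from keras.models import Sequential
from keras.layers.core import Dense, Activation
import numpy as np

# Same (2, 4, 1) architecture as in the question
model = Sequential()
model.add(Dense(input_dim=2, output_dim=4, init="glorot_uniform"))
model.add(Activation("sigmoid"))
model.add(Dense(input_dim=4, output_dim=1, init="glorot_uniform"))
model.add(Activation("sigmoid"))

# The two changes: the smoother MSE loss and the Adam optimizer
model.compile(loss='mse', optimizer='adam')

train_data = np.array([[1, 1], [1, 0], [0, 0], [0, 1]])
label = np.array([[0], [1], [0], [1]])

model.fit(train_data, label, nb_epoch=10000, batch_size=4, verbose=1, shuffle=True, show_accuracy=True)
print(model.predict(train_data))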
answered Oct 20 '22 by wxs


I used a single hidden layer with 4 hidden nodes, and it almost always converges to the right answer within 500 epochs. I used sigmoid activations.
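
In case it helps, a minimal sketch of that setup, written against the Keras 2 API. The answer doesn't name a loss or optimizer, so MSE and Adam are assumed here, as in the answer above:

from keras.models import Sequential
from keras.layers import Dense
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([[0], [1], [1], [0]])

# One hidden layer with 4 sigmoid nodes, then a single sigmoid output
model = Sequential()
model.add(Dense(4, activation='sigmoid', input_dim=2))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=500, batch_size=4, verbose=0)
print(model.predict(X))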

answered Oct 20 '22 by Anon


XOR training with Keras

Below is the minimal neural network architecture required to learn XOR, which should be a (2,2,1) network. The maths shows that a (2,2,1) network can solve the XOR problem, but it doesn't show that a (2,2,1) network is easy to train: it can sometimes take a lot of epochs (iterations) or fail to converge to the global minimum. That said, I've easily gotten good results with (2,3,1) or (2,4,1) network architectures.

The problem seems to be related to the existence of many local minima. Look at this 1998 paper, «Learning XOR: exploring the space of a classic problem» by Richard Bland. Furthermore, initializing the weights with random numbers between 0.5 and 1.0 helps convergence.

It works fine with Keras or TensorFlow using the 'mean_squared_error' loss function, sigmoid activations and the Adam optimizer. Even with pretty good hyperparameters, I observed that the learned XOR model gets trapped in a local minimum about 15% of the time.

from keras.models import Sequential
from keras.layers import Dense
import numpy as np

X = np.array([[0,0],[0,1],[1,0],[1,1]])
y = np.array([[0],[1],[1],[0]])

# Draw the initial weights around 0.75 (cf. the 0.5-1.0 range mentioned above)
def initialize_weights(shape, dtype=None):
    return np.random.normal(loc=0.75, scale=1e-2, size=shape)

model = Sequential()
model.add(Dense(2, 
                activation='sigmoid', 
                kernel_initializer=initialize_weights, 
                input_dim=2))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='mean_squared_error', 
              optimizer='adam', 
              metrics=['accuracy'])

print("*** Training... ***")

model.fit(X, y, batch_size=4, epochs=10000, verbose=0)

print("*** Training done! ***")

print("*** Model prediction on [[0,0],[0,1],[1,0],[1,1]] ***")

print(model.predict_proba(X))

*** Training... ***
*** Training done! ***
*** Model prediction on [[0,0],[0,1],[1,0],[1,1]] ***
[[0.08662204]
 [0.9235283 ]
 [0.92356336]
 [0.06672956]]
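
As a rough sanity check on that ~15% figure, one can retrain from scratch a number of times and count the runs whose thresholded predictions don't match the labels. A sketch (20 runs and a 0.5 decision threshold; slow, since each run trains for 10000 epochs):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.array([[0,0],[0,1],[1,0],[1,1]])
y = np.array([[0],[1],[1],[0]])

def initialize_weights(shape, dtype=None):
    return np.random.normal(loc=0.75, scale=1e-2, size=shape)

runs, stuck = 20, 0
for _ in range(runs):
    # Rebuild the model each time so every run starts from fresh weights
    model = Sequential()
    model.add(Dense(2, activation='sigmoid',
                    kernel_initializer=initialize_weights, input_dim=2))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(X, y, batch_size=4, epochs=10000, verbose=0)

    # A run counts as stuck if the rounded predictions don't match y
    preds = (model.predict(X) > 0.5).astype(int)
    if not np.array_equal(preds, y):
        stuck += 1

print("stuck in a local minimum: {} / {} runs".format(stuck, runs))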

answered Oct 20 '22 by Claude COULOMBE