I'm just getting started with Weka and having trouble with the first steps.
We've got our training set:
@relation PerceptronXOR
@attribute X1 numeric
@attribute X2 numeric
@attribute Output numeric
@data
1,1,-1
-1,1,1
1,-1,1
-1,-1,-1
The first thing I want to do is just train a network and then classify a set using the Weka GUI. What I've been doing so far:
Using Weka 3.7.0.
outputs:
=== Run information ===
Scheme: weka.classifiers.functions.MultilayerPerceptron -L 0.3 -M 0.2 -N 500 -V 0 -S 0 -E 20 -H 2 -R
Relation: PerceptronXOR
Instances: 4
Attributes: 3
X1
X2
Output
Test mode: evaluate on training data
=== Classifier model (full training set) ===
Linear Node 0
Inputs Weights
Threshold 0.21069691964232443
Node 1 1.8781169869419072
Node 2 -1.8403146612166397
Sigmoid Node 1
Inputs Weights
Threshold -3.7331156814378685
Attrib X1 3.6380519730323164
Attrib X2 -1.0420815868133226
Sigmoid Node 2
Inputs Weights
Threshold -3.64785119182632
Attrib X1 3.603244645539393
Attrib X2 0.9535137571446323
Class
Input
Node 0
Time taken to build model: 0 seconds
=== Evaluation on training set ===
=== Summary ===
Correlation coefficient 0.7047
Mean absolute error 0.6073
Root mean squared error 0.7468
Relative absolute error 60.7288 %
Root relative squared error 74.6842 %
Total Number of Instances 4
It seems odd that 500 iterations at a learning rate of 0.3 doesn't bring the error down, but 5000 iterations at 0.1 does, so let's go with that.
Now use the test data set:
@relation PerceptronXOR
@attribute X1 numeric
@attribute X2 numeric
@attribute Output numeric
@data
1,1,-1
-1,1,1
1,-1,1
-1,-1,-1
0.5,0.5,-1
-0.5,0.5,1
0.5,-0.5,1
-0.5,-0.5,-1
=== Run information ===
Scheme: weka.classifiers.functions.MultilayerPerceptron -L 0.1 -M 0.2 -N 5000 -V 0 -S 0 -E 20 -H 2 -R
Relation: PerceptronXOR
Instances: 4
Attributes: 3
X1
X2
Output
Test mode: user supplied test set: size unknown (reading incrementally)
=== Classifier model (full training set) ===
Linear Node 0
Inputs Weights
Threshold -1.2208619057226187
Node 1 3.1172079341507497
Node 2 -3.212484459911485
Sigmoid Node 1
Inputs Weights
Threshold 1.091378074639599
Attrib X1 1.8621040828953983
Attrib X2 1.800744048145267
Sigmoid Node 2
Inputs Weights
Threshold -3.372580743113282
Attrib X1 2.9207154176666386
Attrib X2 2.576791630598144
Class
Input
Node 0
Time taken to build model: 0.04 seconds
=== Evaluation on test set ===
=== Summary ===
Correlation coefficient 0.8296
Mean absolute error 0.3006
Root mean squared error 0.6344
Relative absolute error 30.0592 %
Root relative squared error 63.4377 %
Total Number of Instances 8
Why is it unable to classify these correctly?
Is it just because it reached a local minimum quickly on the training data, and doesn't 'know' that the solution doesn't fit all the cases?
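One way to see where the test error comes from is to evaluate the printed network by hand. The sketch below copies the weights from the second model (-L 0.1, -N 5000) and runs a forward pass over the eight test rows. It assumes that each "Threshold" is added as a bias term, and that Weka's default input and class normalization leaves the values unchanged here because the training data already spans exactly -1 to 1. The helper names (`predict`, `mean_absolute_error`) are made up for illustration and are not part of Weka.

```python
import math

# Weights copied from the second model printed above (-L 0.1, -N 5000).
# Each hidden entry is (threshold, weight for X1, weight for X2).
HIDDEN = [
    (1.091378074639599, 1.8621040828953983, 1.800744048145267),   # Sigmoid Node 1
    (-3.372580743113282, 2.9207154176666386, 2.576791630598144),  # Sigmoid Node 2
]
OUT_THRESHOLD = -1.2208619057226187                 # Linear Node 0 threshold
OUT_WEIGHTS = [3.1172079341507497, -3.212484459911485]  # weights for Node 1, Node 2

# The eight rows of the test set: (X1, X2, Output).
TEST_SET = [
    (1, 1, -1), (-1, 1, 1), (1, -1, 1), (-1, -1, -1),
    (0.5, 0.5, -1), (-0.5, 0.5, 1), (0.5, -0.5, 1), (-0.5, -0.5, -1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x1, x2):
    # Assumption: thresholds act as biases, and normalization is the identity
    # because the training inputs and outputs already span [-1, 1].
    hidden = [sigmoid(t + w1 * x1 + w2 * x2) for t, w1, w2 in HIDDEN]
    return OUT_THRESHOLD + sum(w * h for w, h in zip(OUT_WEIGHTS, hidden))


def mean_absolute_error(rows):
    return sum(abs(predict(x1, x2) - y) for x1, x2, y in rows) / len(rows)


if __name__ == "__main__":
    for x1, x2, y in TEST_SET:
        print(f"{x1:5} {x2:5}  target {y:3}  predicted {predict(x1, x2):+.3f}")
    print("MAE:", round(mean_absolute_error(TEST_SET), 4))
```

Under these assumptions the four training rows come out very close to their targets, and the mean absolute error lands near the reported 0.3006. Nearly all of the error comes from the two new points on the diagonal, (0.5, 0.5) and (-0.5, -0.5), which the network was never constrained on during training.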
A learning rate of 0.5 does the job in 500 iterations for both examples. The learning rate controls how far the weights move in response to each training example. The problem is evidently hard, and with only 2 hidden nodes (-H 2) it is easy to get stuck in a local minimum. If you use a low learning rate with a high iteration count, the learning process will be more conservative and more likely to reach a good minimum.
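As a rough illustration of what the learning rate (-L) and momentum (-M) do, here is a minimal sketch of a generic gradient-descent step with momentum. It is the textbook form of the update, not Weka's source code, and the function name `momentum_step` and the example weight and gradient are hypothetical.

```python
def momentum_step(weight, gradient, prev_delta, learning_rate, momentum):
    """One gradient-descent update with momentum; returns (new_weight, delta)."""
    # The learning rate scales the step taken against the gradient;
    # momentum carries over a fraction of the previous step.
    delta = -learning_rate * gradient + momentum * prev_delta
    return weight + delta, delta


if __name__ == "__main__":
    weight, gradient = 1.0, 2.0  # hypothetical values for illustration
    for lr in (0.1, 0.3, 0.5):
        new_weight, delta = momentum_step(weight, gradient, 0.0, lr, 0.2)
        print(f"-L {lr}: delta {delta:+.2f}, new weight {new_weight:.2f}")
```

For the same gradient, a larger learning rate moves the weight further in a single step, which is why fewer iterations can be enough at 0.5, and why a small rate needs many more iterations to cover the same distance.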