
sknn - input dimension mismatch on second fit

I'm trying to build a neural network that uses reinforcement learning. I picked scikit-neuralnetwork (sknn) as the library because it's simple. It seems, though, that calling fit twice crashes Theano.

Here's the simplest code that causes the crash (the choice of layers, the learning rate and n_iter make no difference):

import numpy as np
from sknn.mlp import Classifier, Layer

clf = Classifier(
    layers=[
        Layer("Softmax")
        ],
    learning_rate=0.001,
    n_iter=1)

clf.fit(np.array([[0.]]), np.array([[0.]])) # Initialize the network for learning

X = np.array([[-1.], [1.]])
Y = np.array([[1.], [0.]])

clf.fit(X, Y) # crash

And here's the error I got:

ValueError: Input dimension mis-match. (input[0].shape[1] = 2, input[1].shape[1] = 1)
Apply node that caused the error: Elemwise{Mul}[(0, 1)](y, LogSoftmax.0)
Toposort index: 12
Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
Inputs shapes: [(1L, 2L), (1L, 1L)]
Inputs strides: [(16L, 8L), (8L, 8L)]
Inputs values: [array([[ 1.,  0.]]), array([[ 0.]])]
Outputs clients: [[Sum{axis=[1], acc_dtype=float64}(Elemwise{Mul}[(0, 1)].0)]]

Tested in Python 2.7.11

Does sknn not support fitting multiple times, or am I making some idiotic mistake? If it doesn't, how are you supposed to implement reinforcement learning?

seequ asked Jun 24 '16 08:06


1 Answer

I don't use sknn very often, but it's very similar to sklearn, so I might be able to help!

First of all, calling the fit method reinitialises the weights. If you want to update the weights based on new data, use the partial_fit method instead.
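To make that distinction concrete, here is a minimal sketch in plain Python. TinyModel is a hypothetical one-weight model, not sknn's implementation; it only assumes the behaviour described above, namely that fit starts again from fresh weights while partial_fit continues from the current ones.

```python
from __future__ import print_function  # same print syntax on Python 2 and 3


class TinyModel(object):
    """Hypothetical one-weight model y = w * x, trained by gradient descent."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.w = 0.0
        self.updates = 0  # gradient steps applied since the last reset

    def _step(self, xs, ys):
        # One gradient step on the mean squared error of y = w * x.
        grad = sum(2.0 * (self.w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        self.w -= self.learning_rate * grad
        self.updates += 1

    def fit(self, xs, ys):
        # Assumed behaviour: throw away what was learned and start over.
        self.w = 0.0
        self.updates = 0
        self._step(xs, ys)
        return self

    def partial_fit(self, xs, ys):
        # Assumed behaviour: keep the current weight and refine it.
        self._step(xs, ys)
        return self


xs, ys = [1.0, 2.0], [2.0, 4.0]  # samples of y = 2 * x

refit = TinyModel().fit(xs, ys).fit(xs, ys)
incremental = TinyModel().partial_fit(xs, ys).partial_fit(xs, ys)

print("fit twice:        ", refit.updates, "update(s), w =", refit.w)
print("partial_fit twice:", incremental.updates, "update(s), w =", incremental.w)
```

In this toy model the second fit call discards the first update, while two partial_fit calls move the weight further towards 2. For a reinforcement learning loop, where data arrives in small batches, the second pattern is the one you want.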

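As for the crash: your second X array differs from the first in its first dimension (the number of samples), not in its second (the number of features). The traceback in the question reports the mismatch on the target side: y has shape (1, 2) while the LogSoftmax output has shape (1, 1).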

from __future__ import print_function  # keeps the prints valid on Python 2 and 3

import numpy as np
from sknn.mlp import Classifier, Layer

clf = Classifier(
    layers=[
        Layer("Softmax")
        ],
    learning_rate=0.001,
    n_iter=1)

# Original training data (printed for its shape only; not used for training below)
X = np.array([[0.]])
Y = np.array([[0.]])
print(X.shape, Y.shape)

# Data used for second fitting
X = np.array([[-1.], [1.]])
Y = np.array([[1.], [0.]])
print(X.shape, Y.shape)

# Use the partial fit method to update weights
clf.partial_fit(X, Y) # Initialize the network for learning
clf.partial_fit(X, Y) # Update the weights

# Multiple training examples by stacking two on top of each other
X = np.concatenate((X, X))
Y = np.concatenate((Y, Y))
print(X.shape, Y.shape)

clf.partial_fit(X, Y)

Outputs:

(1, 1) (1, 1)
(2, 1) (2, 1)
(4, 1) (4, 1)
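
One way to read the traceback in the question is sketched below in plain Python, so it runs without sknn or Theano. It assumes, based only on the y value [[ 1., 0.]] and the (1, 1) output shape shown in the traceback, that labels are expanded to one column per distinct class and that the output layer is sized on the first call. one_hot and check_compatible are hypothetical helpers, not part of sknn.

```python
from __future__ import print_function  # same print syntax on Python 2 and 3


def one_hot(labels, classes):
    """Encode each label as a row with one column per known class."""
    return [[1.0 if label == cls else 0.0 for cls in classes] for label in labels]


def check_compatible(targets, n_outputs):
    """Raise ValueError when target rows do not match the output layer width."""
    width = len(targets[0])
    if width != n_outputs:
        raise ValueError("targets have %d columns, network has %d outputs"
                         % (width, n_outputs))


# First call: a single label, so only one class is known (assumption).
first_labels = [0.0]
n_outputs = len(set(first_labels))      # output layer sized from the first call

# Second call: two distinct labels appear, so the encoded rows are wider.
second_labels = [1.0, 0.0]
second_classes = sorted(set(second_labels))
targets = one_hot(second_labels, second_classes)
print(targets)

try:
    check_compatible(targets, n_outputs)
    mismatch = None
except ValueError as err:
    mismatch = str(err)
print(mismatch)

# Sizing the output layer from every class up front avoids the mismatch.
all_classes = sorted(set(first_labels + second_labels))
check_compatible(one_hot(second_labels, all_classes), len(all_classes))
```

If this reading is right, the network should see every class it will ever need to predict on its very first call, for example by initialising it with a batch that contains both labels, as the code above does.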
piman314 answered Sep 26 '22 01:09