I'm trying to implement a neural network that classifies images into one of the two discrete categories. The problem is, however, that it currently always predicts 0 for any input and I'm not really sure why.
Here's my feature extraction method:
def extract(file):
# Resize and subtract mean pixel
img = cv2.resize(cv2.imread(file), (224, 224)).astype(np.float32)
img[:, :, 0] -= 103.939
img[:, :, 1] -= 116.779
img[:, :, 2] -= 123.68
# Normalize features
img = (img.flatten() - np.mean(img)) / np.std(img)
return np.array([img])
Here's my gradient descent routine:
def fit(x, y, t1, t2):
"""Training routine"""
ils = x.shape[1] if len(x.shape) > 1 else 1
labels = len(set(y))
if t1 is None or t2 is None:
t1 = randweights(ils, 10)
t2 = randweights(10, labels)
params = np.concatenate([t1.reshape(-1), t2.reshape(-1)])
res = grad(params, ils, 10, labels, x, y)
params -= 0.1 * res
return unpack(params, ils, 10, labels)
Here are my forward and back(gradient) propagations:
def forward(x, theta1, theta2):
"""Forward propagation"""
m = x.shape[0]
# Forward prop
a1 = np.vstack((np.ones([1, m]), x.T))
z2 = np.dot(theta1, a1)
a2 = np.vstack((np.ones([1, m]), sigmoid(z2)))
a3 = sigmoid(np.dot(theta2, a2))
return (a1, a2, a3, z2, m)
def grad(params, ils, hls, labels, x, Y, lmbda=0.01):
"""Compute gradient for hypothesis Theta"""
theta1, theta2 = unpack(params, ils, hls, labels)
a1, a2, a3, z2, m = forward(x, theta1, theta2)
d3 = a3 - Y.T
print('Current error: {}'.format(np.mean(np.abs(d3))))
d2 = np.dot(theta2.T, d3) * (np.vstack([np.ones([1, m]), sigmoid_prime(z2)]))
d3 = d3.T
d2 = d2[1:, :].T
t1_grad = np.dot(d2.T, a1.T)
t2_grad = np.dot(d3.T, a2.T)
theta1[0] = np.zeros([1, theta1.shape[1]])
theta2[0] = np.zeros([1, theta2.shape[1]])
t1_grad = t1_grad + (lmbda / m) * theta1
t2_grad = t2_grad + (lmbda / m) * theta2
return np.concatenate([t1_grad.reshape(-1), t2_grad.reshape(-1)])
And here's my prediction function:
def predict(theta1, theta2, x):
"""Predict output using learned weights"""
m = x.shape[0]
h1 = sigmoid(np.hstack((np.ones([m, 1]), x)).dot(theta1.T))
h2 = sigmoid(np.hstack((np.ones([m, 1]), h1)).dot(theta2.T))
return h2.argmax(axis=1)
I can see that the error rate is gradually decreasing with each iteration, generally converging somewhere around 1.26e-05.
What I've tried so far:
Edit: An average output of h2 looks like the following:
[0.5004899 0.45264441]
[0.50048522 0.47439413]
[0.50049019 0.46557124]
[0.50049261 0.45297816]
So, very similar sigmoid outputs for all validation examples.
Neural networks work better at predictive analytics because of the hidden layers. Linear regression models use only input and output nodes to make predictions. The neural network also uses the hidden layer to make predictions more accurate. That's because it 'learns' the way a human does.
Convolutional Neural Networks, or CNNs, were designed to map image data to an output variable. They have proven so effective that they are the go-to method for any type of prediction problem involving image data as an input.
Neural network models can be configured for multi-output regression tasks.
The randomness of an artificial neural network(ANN) is when the same neural network is trained on the same data, and it produces different results.
My network does always predict the same class. What is the problem?
I had this a couple of times. Although I'm currently too lazy to go through your code, I think I can give some general hints which might also help others who have the same symptom but probably different underlying problems.
For every class i the network should be able to predict, try the following:
If this doesn't work, there are four possible error sources:
float32
but actually is an integer.See sklearn for details.
The idea is to start with a tiny training dataset (probably only one item). Then the model should be able to fit the data perfectly. If this works, you make a slightly larger dataset. Your training error should slightly go up at some point. This reveals your models capacity to model the data.
Check how often the other class(es) appear. If one class dominates the others (e.g. one class is 99.9% of the data), this is a problem. Look for "outlier detection" techniques.
0.001
is often used / working. This is also relevant if you use Adam as an optimizer.This is inspired by reddit:
imbalanced-learn
After a week and a half of research I think I understand what the issue is. There is nothing wrong with the code itself. The only two issues that prevent my implementation from classifying successfully are time spent learning and proper selection of learning rate / regularization parameters.
I've had the learning routine running for some tome now, and it's pushing 75% accuracy already, though there is still plenty of space for improvement.
Same happened to me. I had an imbalanced dataset (about 66%-33% sample distribution between classes 0 and 1, respectively) and the net was always outputting 0.0
for all samples after the first iteration.
My problem was simply a too high learning rate. Switching it to 1e-05
solved the issue.
More generally, what I suggest to do is to print, before the parameters' update:
And then check the same three items after the parameter update. What you should see in the next batch is a gradual change in the net output. When my learning rate was too high, already in the second iteration the net output would shoot to either all 1.0
s or all 0.0
s for all samples in the batch.
Same happened to me. Mine was in deeplearning4j
JAVA
library for image classification.It kept on giving the final output of the last training folder for every test. I was able to solve it by decreasing the learning rate.
Approaches can be used :
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With