
Implement Cost Function of Neural Network (Week #5 Coursera) using Python

Based on the Coursera Machine Learning course, I'm trying to implement the cost function for a neural network in Python. There is a similar question (with an accepted answer), but the code in that answer is written in Octave. Not to be lazy, I have tried to adapt the relevant concepts of that answer to my case, and as far as I can tell, I'm implementing the function correctly. The cost I output differs from the expected cost, however, so I'm doing something wrong.

Here's a small reproducible example:

The following link leads to an .npz file which can be loaded (as below) to obtain the relevant data. If you use it, please rename the file to "arrays.npz".

http://www.filedropper.com/arrays_1

import numpy as np

with np.load("arrays.npz") as data:

    thrLayer = data['thrLayer'] # The final layer post activation; you
    # can derive this final layer, if verification is needed, using the weights below

    thetaO = data['thetaO'] # The weight array between layers 1 and 2
    thetaT = data['thetaT'] # The weight array between layers 2 and 3

    Ynew = data['Ynew'] # The output array with a 1 in position i and 0s elsewhere

    #class i is the class that the data described by X[i,:] belongs to

    X = data['X'] #Raw data with 1s appended to the first column
    Y = data['Y'] #One dimensional column vector; entry i contains the class of entry i


m = len(thrLayer)
k = thrLayer.shape[1]
cost = 0

for i in range(m):
    for j in range(k):
        cost += -Ynew[i,j]*np.log(thrLayer[i,j]) - (1 - Ynew[i,j])*np.log(1 - thrLayer[i,j])
print(cost)
cost /= m

'''
Regularized Cost Component
'''

regCost = 0

for i in range(len(thetaO)):
    for j in range(1,len(thetaO[0])):
        regCost += thetaO[i,j]**2

for i in range(len(thetaT)):
    for j in range(1,len(thetaT[0])):
        regCost += thetaT[i,j]**2

lam = 1 # Regularization parameter
regCost *= lam/(2*m)


print(cost)
print(regCost)

The expected values are: cost should be 0.287629 and cost + regCost should be 0.383770.
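For anyone who wants to verify thrLayer rather than trust the saved array, here is a minimal forward-propagation sketch. It assumes a sigmoid activation and the course's usual weight layout (one row per output unit, bias weight in column 0), which matches how the regularization loops skip column 0. The function names and the small random arrays are made up for illustration; they are not the contents of arrays.npz.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, thetaO, thetaT):
    """Return the output-layer activations.

    X already has a leading column of 1s; each theta is assumed to have
    shape (units_out, units_in + 1).
    """
    a2 = sigmoid(X @ thetaO.T)                        # hidden layer
    a2 = np.hstack([np.ones((a2.shape[0], 1)), a2])   # add the bias unit
    return sigmoid(a2 @ thetaT.T)                     # output layer

# Made-up shapes: 5 samples, 4 features, 3 hidden units, 2 classes
rng = np.random.default_rng(0)
X_demo = np.hstack([np.ones((5, 1)), rng.normal(size=(5, 4))])
thetaO_demo = rng.normal(size=(3, 5))
thetaT_demo = rng.normal(size=(2, 4))

out_demo = forward(X_demo, thetaO_demo, thetaT_demo)
print(out_demo.shape)  # (5, 2)
```

With the real data, the result of forward(X, thetaO, thetaT) could be compared against thrLayer using np.allclose.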

This is the cost function posted in the question above, for reference:


[Image not preserved: the regularized neural network cost function]
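For readers without the image, the regularized cost from the course is usually written as below. This is a reconstruction that matches the code in this post, not a copy of the original picture. Here h corresponds to thrLayer, y to the one-hot label array, λ to lam, K to the number of classes, and the inner index i starts at 1 so that the bias column of each weight matrix is left out of the penalty.

```latex
J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
  \left[ y_k^{(i)} \log\left( h_\Theta(x^{(i)})_k \right)
       + \left( 1 - y_k^{(i)} \right) \log\left( 1 - h_\Theta(x^{(i)})_k \right) \right]
  + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}}
    \left( \Theta_{j,i}^{(l)} \right)^2
```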

Muno asked Oct 18 '22 04:10


1 Answer

The problem is that you are using the wrong class labels. When computing the cost function, you need to use the ground truth, or the true class labels.

I'm not sure what your Ynew array was, but it wasn't the training outputs. So I changed your code to build the class labels from Y in place of Ynew, and got the correct cost.

import numpy as np

with np.load("arrays.npz") as data:

    thrLayer = data['thrLayer'] # The final layer post activation; you
    # can derive this final layer, if verification needed, using weights below

    thetaO = data['thetaO'] # The weight array between layers 1 and 2
    thetaT = data['thetaT'] # The weight array between layers 2 and 3

    Ynew = data['Ynew'] # The output array with a 1 in position i and 0s elsewhere

    #class i is the class that the data described by X[i,:] belongs to

    X = data['X'] #Raw data with 1s appended to the first column
    Y = data['Y'] #One dimensional column vector; entry i contains the class of entry i


m = len(thrLayer)
k = thrLayer.shape[1]
cost = 0

Y_arr = np.zeros(Ynew.shape)  # One-hot labels rebuilt from the true classes in Y
for i in range(m):
    Y_arr[i,int(Y[i,0])-1] = 1  # Classes are 1-based, columns are 0-based

for i in range(m):
    for j in range(k):
        cost += -Y_arr[i,j]*np.log(thrLayer[i,j]) - (1 - Y_arr[i,j])*np.log(1 - thrLayer[i,j])
cost /= m

'''
Regularized Cost Component
'''

regCost = 0

for i in range(len(thetaO)):
    for j in range(1,len(thetaO[0])):
        regCost += thetaO[i,j]**2

for i in range(len(thetaT)):
    for j in range(1,len(thetaT[0])):
        regCost += thetaT[i,j]**2
lam = 1  # Regularization parameter
regCost *= lam/(2.*m)  # 2.*m forces float division under Python 2


print(cost)
print(cost + regCost)

This outputs:

0.287629165161
0.383769859091
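The same computation can also be written without explicit loops. The following is only a sketch: the helper names one_hot and nn_cost are hypothetical, the demo arrays are made up rather than taken from arrays.npz, and it assumes 1-based class labels in an (m, 1) column, as in the code above.

```python
import numpy as np

def one_hot(Y, k):
    """Turn 1-based class labels of shape (m, 1) into an (m, k) indicator array."""
    labels = Y.astype(int).ravel() - 1
    Y_arr = np.zeros((labels.size, k))
    Y_arr[np.arange(labels.size), labels] = 1
    return Y_arr

def nn_cost(h, Y_arr, thetas, lam):
    """Return (unregularized cost, regularization term) for output activations h."""
    m = h.shape[0]
    cost = -np.sum(Y_arr * np.log(h) + (1 - Y_arr) * np.log(1 - h)) / m
    # Skip column 0 of every weight matrix: bias weights are not penalized.
    reg = sum(np.sum(t[:, 1:] ** 2) for t in thetas) * lam / (2.0 * m)
    return cost, reg

# Made-up example: 2 samples, 2 classes, one tiny weight matrix
h_demo = np.array([[0.9, 0.1],
                   [0.2, 0.8]])
Y_demo = np.array([[1.0], [2.0]])
thetas_demo = [np.array([[1.0, 2.0, 3.0]])]

cost_demo, reg_demo = nn_cost(h_demo, one_hot(Y_demo, 2), thetas_demo, lam=1)
print(cost_demo, cost_demo + reg_demo)
```

With the arrays from the question, the call would presumably look like nn_cost(thrLayer, one_hot(Y, k), [thetaO, thetaT], lam).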

Edit: Fixed an integer division error in regCost *= lam/(2*m). With integer lam and m, Python 2 truncates the quotient to 0, which was zeroing out regCost; writing 2.*m forces float division.
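A small illustration of that division pitfall, with a made-up sample count. In Python 3 the `/` operator always performs true division, so the floor-division operator `//` is used here to mimic what Python 2's `/` does with two integers:

```python
lam = 1
m = 5000  # made-up sample count, for illustration only

floor_scale = lam // (2 * m)   # integer (floor) division, as Python 2's `/` does for ints
true_scale = lam / (2.0 * m)   # a float operand gives true division in Python 2 and 3

print(floor_scale)  # 0
print(true_scale)   # 0.0001
```

Under Python 2, putting `from __future__ import division` at the top of the file is another way to get true division from `/`.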

bpachev answered Oct 20 '22 22:10