 

Jacobian matrix computation for artificial neural networks

Recently I started thinking about implementing the Levenberg-Marquardt algorithm for training an artificial neural network (ANN). The key to the implementation is computing a Jacobian matrix. I spent a couple of hours studying the topic, but I can't figure out how to compute it exactly.

Say I have a simple feed-forward network with 3 inputs, 4 neurons in the hidden layer, and 2 outputs. The layers are fully connected. I also have a training set of 5 rows.

  1. What exactly should be the size of the Jacobian matrix?
  2. What exactly should I put in place of the derivatives? (Examples of the formulas for the top-left and bottom-right corners, along with some explanation, would be perfect.)

This really doesn't help:

[Image: the textbook definition of the Jacobian, J_ij = ∂F_i/∂x_j, for a vector-valued function F(x)]

What are F and x in terms of a neural network?

asked Oct 01 '14 by Andrzej Gis


1 Answer

The Jacobian is the matrix of all first-order partial derivatives of a vector-valued function. In the neural network case, it is an N-by-W matrix, where N is the number of entries being fitted (one row per training sample and per network output; see the worked count below) and W is the total number of parameters (weights + biases) of the network. It is built by taking the partial derivative of each output with respect to each weight, and has the form:

$$
J = \begin{bmatrix}
\dfrac{\partial F(x_1, w)}{\partial w_1} & \cdots & \dfrac{\partial F(x_1, w)}{\partial w_W} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial F(x_N, w)}{\partial w_1} & \cdots & \dfrac{\partial F(x_N, w)}{\partial w_W}
\end{bmatrix}
$$

where F(x_i, w) is the network function evaluated for the i-th input vector of the training set using the weight vector w, and w_j is the j-th element of the weight vector w. In traditional Levenberg-Marquardt implementations, the Jacobian is approximated using finite differences. For neural networks, however, it can be computed very efficiently by using the chain rule of calculus and the first derivatives of the activation functions.
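For the concrete network in the question (3 inputs, 4 hidden neurons, 2 outputs, fully connected, 5 training rows), and using the usual multi-output convention of one Jacobian row per (sample, output) pair, the count works out as:

W = (3 × 4 + 4) + (4 × 2 + 2) = 16 + 10 = 26 parameters
N = 5 samples × 2 outputs = 10 rows

so the Jacobian is 10-by-26. That also answers the corner question: writing F_k for the k-th network output, the top-left entry is ∂F_1(x_1, w)/∂w_1, the derivative of the first output for the first training row with respect to the first parameter, and the bottom-right entry is ∂F_2(x_5, w)/∂w_26, the derivative of the second output for the fifth row with respect to the 26th (last) parameter. Which concrete weight or bias each column corresponds to depends only on how you order w.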
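As a sanity check, here is a minimal finite-difference sketch in Python/NumPy for exactly this 3-4-2 network. The sigmoid hidden layer and linear output layer are assumptions made for the example, not something fixed by the question:

```python
import numpy as np

def forward(w, x):
    """Evaluate the 3-4-2 network for one input vector x, with all 26
    parameters packed into the flat vector w."""
    W1 = w[:12].reshape(4, 3)    # hidden-layer weights (4 x 3)
    b1 = w[12:16]                # hidden-layer biases  (4)
    W2 = w[16:24].reshape(2, 4)  # output-layer weights (2 x 4)
    b2 = w[24:26]                # output-layer biases  (2)
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))  # sigmoid hidden layer (assumed)
    return W2 @ h + b2                         # linear output layer (assumed)

def jacobian(w, X, eps=1e-6):
    """Numerical Jacobian: one row per (sample, output) pair,
    one column per parameter."""
    rows = []
    for x in X:
        base = forward(w, x)
        for k in range(base.size):           # one row per output unit
            row = np.empty(w.size)
            for j in range(w.size):          # one column per parameter
                wp = w.copy()
                wp[j] += eps                 # perturb a single parameter
                row[j] = (forward(wp, x)[k] - base[k]) / eps
            rows.append(row)
    return np.array(rows)

X = np.random.randn(5, 3)        # 5 training rows, 3 inputs each
w = 0.1 * np.random.randn(26)    # 26 parameters in total
print(jacobian(w, X).shape)      # -> (10, 26)
```

A chain-rule (backpropagation-style) implementation computes each row in a single backward sweep instead of the inner perturbation loop; the finite-difference version is mainly useful for verifying it. Note also that Levenberg-Marquardt usually differentiates the error e_i = t_i − F(x_i, w) rather than the raw output, which only flips the sign of each row.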

answered Oct 17 '22 by abhinash