Recently I started thinking about implementing Levenberg-Marquardt algorithm for learning an Artificial Neural Network (ANN). The key to the implementation is to compute a Jacobian matrix. I spent a couple hours studying the topic, but I can't figure out how to compute it exactly.
Say I have a simple feed-forward network with 3 inputs, 4 neurons in the hidden layer and 2 outputs. Layers are fully connected. I also have 5 rows long learning set.
This really doesn't help:
What are F and x in terms of a neural network?
The Jacobian is a matrix of all first-order partial derivatives of a vector-valued function. In the neural network case, it is a N-by-W matrix, where N is the number of entries in our training set and W is the total number of parameters (weights + biases) of our network. It can be created by taking the partial derivatives of each output in respect to each weight, and has the form:
Where F(xi, w) is the network function evaluated for the i-th input vector of the training set using the weight vector w and wj is the j-th element of the weight vector w of the network. In traditional Levenberg-Marquardt implementations, the Jacobian is approximated by using finite differences. However, for neural networks, it can be computed very efficiently by using the chain rule of calculus and the first derivatives of the activation functions.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With