 

How can I apply multithreading to the backpropagation neural network training?

For my university project I am creating a neural network that classifies how likely a credit card transaction is to be fraudulent. I am training it with backpropagation, and I am writing it in Java. I would like to apply multithreading, because my computer is a quad-core i7. It bugs me to spend hours training and see most of my cores sitting idle.

But how would I apply multithreading to backpropagation? Backprop works by propagating the errors backwards through the network, so one layer must be finished before the next can continue. Is there any way I can modify my program to do multicore backprop?

Asked Dec 02 '09 by Miley



1 Answer

First of all, don't use plain backpropagation. There are many other options out there. I would suggest trying RPROP (resilient propagation). It won't be that big of a modification to your backpropagation algorithm. You do not need to specify a learning rate or momentum. It's really almost as if you have an individual, variable learning rate for every connection in the neural network. A sketch of the idea is shown below.
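To make that concrete, here is a minimal per-weight RPROP update sketch in Java (the iRPROP- variant). It assumes you have already summed the error gradient for each weight over the epoch into a gradient array; the class and field names are illustrative, not taken from any particular library.

    // Minimal iRPROP- style weight update: each weight keeps its own step size
    // that grows while the gradient keeps its sign and shrinks when it flips.
    public class RpropUpdater {
        private static final double ETA_PLUS = 1.2, ETA_MINUS = 0.5;
        private static final double DELTA_MAX = 50.0, DELTA_MIN = 1e-6;

        private final double[] delta;        // per-weight step size
        private final double[] prevGradient; // gradient from the previous epoch

        public RpropUpdater(int weightCount) {
            delta = new double[weightCount];
            prevGradient = new double[weightCount];
            java.util.Arrays.fill(delta, 0.1); // common initial step size
        }

        public void update(double[] weights, double[] gradient) {
            for (int i = 0; i < weights.length; i++) {
                double sign = gradient[i] * prevGradient[i];
                if (sign > 0) {
                    // same direction as last epoch: grow the step
                    delta[i] = Math.min(delta[i] * ETA_PLUS, DELTA_MAX);
                } else if (sign < 0) {
                    // gradient changed sign: shrink the step and skip this update
                    delta[i] = Math.max(delta[i] * ETA_MINUS, DELTA_MIN);
                    prevGradient[i] = 0; // iRPROP-: forget the old gradient
                    continue;
                }
                weights[i] -= Math.signum(gradient[i]) * delta[i];
                prevGradient[i] = gradient[i];
            }
        }
    }

Notice there is no global learning rate: the per-weight step sizes adapt on their own, which is why RPROP tends to be much less sensitive to tuning than standard backpropagation.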

As for applying multithreading to backpropagation, I just wrote an article on this topic:

http://www.heatonresearch.com/encog/mprop/compare.html

Basically, I create a number of threads and divide up the training data so that each thread gets a nearly equal share. Each thread calculates the gradients for its share, and the per-thread results are summed in a reduce step. How the gradients are applied to the weights depends on the propagation training algorithm used, but the weight update is done in a critical section. A sketch of this structure follows.
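Here is a minimal Java sketch of that divide-and-reduce structure. The Network type and its weightCount()/backpropGradients() methods are assumed for illustration only (this is not the actual Encog API); each training sample is an input/target pair.

    import java.util.List;
    import java.util.concurrent.*;

    // Sketch: split the training set across threads, accumulate gradients
    // locally in each thread, then sum them in a single reduce step.
    public class ParallelGradientTrainer {
        public static double[] computeGradients(Network net, List<double[][]> samples,
                                                int threads) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(threads);
            List<Future<double[]>> futures = new java.util.ArrayList<>();
            int chunk = (samples.size() + threads - 1) / threads;

            // Divide the training data into near-equal chunks, one per thread.
            for (int t = 0; t < threads; t++) {
                List<double[][]> part = samples.subList(
                        Math.min(t * chunk, samples.size()),
                        Math.min((t + 1) * chunk, samples.size()));
                futures.add(pool.submit(() -> {
                    double[] local = new double[net.weightCount()];
                    for (double[][] sample : part) {
                        // forward + backward pass, accumulating gradients locally
                        double[] g = net.backpropGradients(sample[0], sample[1]);
                        for (int i = 0; i < g.length; i++) local[i] += g[i];
                    }
                    return local;
                }));
            }

            // Reduce step: sum the per-thread gradient vectors.
            double[] total = new double[net.weightCount()];
            for (Future<double[]> f : futures) {
                double[] g = f.get();
                for (int i = 0; i < g.length; i++) total[i] += g[i];
            }
            pool.shutdown();
            // Apply `total` to the weights afterwards in a single critical section.
            return total;
        }
    }

Because each thread only writes to its own local gradient array, the threads never contend with each other; the only serialized work is the final summation and the weight update.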

When you have considerably more training samples than weights, the code spends much more time in the multithreaded gradient calculation than in the critical-section weight update.

I provide some performance results at the above link. It really does speed things up!

Answered by JeffHeaton