Neural Net - Selecting Data For Each Mini Batch

Possibly an ANN 101 question regarding mini-batch processing. Google didn't seem to have the answer. A search here didn't yield anything either. My guess is there's a book somewhere that says, "do it this way!" and I just haven't read that book.

I'm coding a neural net in Python (not that the language matters). I'm attempting to add mini-batch updates instead of full batch. Is it necessary to select each observation once for each epoch? Mini-batches would be data values 1:10, 11:20, 21:30, etc. so that all observations are used, and they are all used once.

Or is it correct to select each mini-batch randomly from the training set based on a probability, so that each observation may be used once, multiple times, or not at all in a given epoch? For 20 mini-batches per epoch, each data element would have a 5% chance of being selected for any given mini-batch. Mini-batches would then be random in composition and size, with roughly 1 in every 20 data points included in each of the 20 mini-batches, but with no guarantee that every observation is selected.
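To make the two schemes concrete, here is a minimal NumPy sketch (the 100-element `data` array is a hypothetical stand-in for a training set) contrasting the fixed partition with the 5%-probability random selection described above:

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(100)  # hypothetical training set of 100 observations

# Scheme 1: partition into fixed mini-batches (1:10, 11:20, ...) --
# every observation is used exactly once per epoch
batches_partition = [data[i:i + 10] for i in range(0, len(data), 10)]

# Scheme 2: independent 5% chance per observation per mini-batch --
# an observation may land in several mini-batches, or in none,
# and batch sizes vary around len(data) * 0.05
batches_random = [data[rng.random(len(data)) < 0.05] for _ in range(20)]
```

With scheme 1 the concatenated batches reproduce the dataset exactly; with scheme 2 the total coverage per epoch is only approximate.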

asked Dec 04 '12 by user791770

2 Answers

Some tips regarding mini-batch training:

Shuffle your samples before every epoch

The reason is the same as why you shuffle the samples in online training: Otherwise the network might simply memorize the order in which you feed the samples.

Use a fixed batch size for every batch and for every epoch

There is probably also a statistical reason, but it simplifies the implementation as it enables you to use fast implementations of matrix multiplications for your calculations. (e.g. BLAS)

Adapt your learning rate to the batch size

For larger batches you'll have to use a smaller learning rate, otherwise the ANN tends to converge towards a sub-optimal minimum. I always scaled my learning rates by 1/sqrt(n), where n is the batch size. Please note that this is just an empirical value from experiments.
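The three tips above can be sketched together in a single epoch of mini-batch SGD. This is an illustrative example on a plain linear model, not the asker's network; the function name and hyperparameter defaults are made up for the sketch:

```python
import numpy as np

def train_epoch(X, y, weights, base_lr=0.1, batch_size=32):
    """One epoch of mini-batch SGD on a linear model (illustrative only)."""
    n_samples = X.shape[0]
    lr = base_lr / np.sqrt(batch_size)        # tip 3: scale lr by 1/sqrt(n)
    order = np.random.permutation(n_samples)  # tip 1: reshuffle every epoch
    for start in range(0, n_samples - batch_size + 1, batch_size):
        idx = order[start:start + batch_size]  # tip 2: fixed batch size
        Xb, yb = X[idx], y[idx]
        preds = Xb @ weights                   # one matrix multiply per batch
        grad = Xb.T @ (preds - yb) / batch_size
        weights -= lr * grad
    return weights
```

Because every batch has the same shape, the forward and backward passes are each a single matrix multiplication, which is exactly what lets a BLAS-backed library do the heavy lifting.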

answered Sep 28 '22 by Domderon


Your first guess is correct. Just randomize your dataset first, then take mini-batches of (say) 20 in order: samples 1-20, then 21-40, and so on, so that the whole dataset gets used.

That doesn't mean the dataset is used only once, though. You normally need to run multiple epochs over the full dataset for your network to learn properly.

Mini-batching is primarily used to speed up the learning process.
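A minimal sketch of this scheme (shuffle once, slice into batches of 20, repeat for several epochs; the array sizes and epoch count are arbitrary for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.arange(100)              # hypothetical training set
shuffled = rng.permutation(data)   # randomize the dataset first
batch_size = 20
n_epochs = 5                       # multiple passes over the full dataset

seen = 0
for epoch in range(n_epochs):
    for start in range(0, len(shuffled), batch_size):
        batch = shuffled[start:start + batch_size]  # 1-20, then 21-40, ...
        seen += len(batch)
# every observation is visited exactly n_epochs times in total
```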

answered Sep 28 '22 by ThiS