I'm doing transfer learning with Inception on TensorFlow, following this training code: https://raw.githubusercontent.com/tensorflow/hub/master/examples/image_retraining/retrain.py
At the bottom of the script we can set the parameters for our dataset (the training/validation/test percentages and the training/validation/test batch sizes).
Let's say I have a very large dataset (1 million images) and I have already set the training:validation:testing percentages to 75:15:10.
But I have no idea how to set the batch parameters correctly. For now I set train_batch_size to 64. Do I need to use the same value for validation_batch_size, or should it be bigger or smaller than train_batch_size?
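Just as a sanity check on my own numbers (this arithmetic is mine, not something retrain.py prints), this is what that split and batch size would mean:

```python
import math

# My own back-of-the-envelope check (assuming 1,000,000 images, a 75:15:10 split,
# and the train_batch_size of 64 that I'm unsure about).
total_images = 1_000_000
train_n = int(total_images * 0.75)   # 750,000 training images
val_n   = int(total_images * 0.15)   # 150,000 validation images
test_n  = int(total_images * 0.10)   # 100,000 test images

train_batch_size = 64
steps_per_epoch = math.ceil(train_n / train_batch_size)
print(train_n, val_n, test_n, steps_per_epoch)  # 750000 150000 100000 11719
```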
The batch size affects indicators such as the overall training time, the training time per epoch, and the quality of the model. Usually we choose the batch size as a power of two, in the range between 16 and 512, and 32 is a common rule of thumb and a good initial choice.
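For illustration only (the dataset size below is made up), here is how the batch size changes the number of weight updates per epoch:

```python
import math

num_train_examples = 100_000  # hypothetical training set size, for illustration only

# The usual power-of-two candidates between 16 and 512.
for batch_size in (16, 32, 64, 128, 256, 512):
    steps_per_epoch = math.ceil(num_train_examples / batch_size)
    print(f"batch_size={batch_size:4d} -> {steps_per_epoch:5d} steps per epoch")
```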
A commonly used ratio is 80:20, which means 80% of the data is for training and 20% for testing. Other ratios such as 70:30, 60:40, and even 50:50 are also used in practice.
In general, putting 80% of the data in the training set, 10% in the validation set, and 10% in the test set is a good split to start with.
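As a rough sketch (the dataset size and the use of NumPy here are just assumptions for illustration), an 80/10/10 split can be produced like this:

```python
import numpy as np

# Hypothetical example: shuffle indices for 50,000 examples, then cut 80/10/10.
rng = np.random.default_rng(seed=0)
indices = rng.permutation(50_000)

n_train = int(0.8 * len(indices))
n_val   = int(0.1 * len(indices))

train_idx = indices[:n_train]
val_idx   = indices[n_train:n_train + n_val]
test_idx  = indices[n_train + n_val:]   # the remaining ~10%

print(len(train_idx), len(val_idx), len(test_idx))  # 40000 5000 5000
```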
You can follow the advice from the other answers for the dataset split ratio. However, the batch size has absolutely nothing to do with how you've split your datasets.
The batch size determines how many training examples are processed in parallel for training/inference. The batch size at training time can affect both how fast and how well your training converges; you can find a discussion of this effect here. Thus, for train_batch_size it's worth picking a value that is neither too small nor too large (as in the discussion linked above). For some applications, using the largest possible training batches can actually be desirable, but in general you select it through experiments and validation.
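A minimal sketch of what "select it through experiments and validation" can look like; the train_and_validate function below is a placeholder for your actual retraining run, not part of retrain.py:

```python
def train_and_validate(train_batch_size):
    # Placeholder: run your retraining with this batch size and return the
    # validation accuracy. The numbers below are fake, only so the sketch runs.
    fake_scores = {32: 0.91, 64: 0.93, 128: 0.92, 256: 0.90}
    return fake_scores[train_batch_size]

candidate_batch_sizes = [32, 64, 128, 256]
results = {bs: train_and_validate(bs) for bs in candidate_batch_sizes}
best_bs = max(results, key=results.get)
print("best train_batch_size:", best_bs)  # picks whichever validated best
```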
However, for validation_batch_size and test_batch_size, you should pick the largest batch size that your hardware can handle without running out of memory and crashing. Finding this is usually a simple trial-and-error process. The larger your batch size at inference time, the faster inference will be, since more inputs can be processed in parallel.
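As one possible sketch of that trial and error, assuming a Keras-style model object (model) and the 299x299x3 input size that Inception expects, neither of which comes from retrain.py itself, you could keep doubling the batch size until prediction runs out of memory:

```python
import numpy as np
import tensorflow as tf

def largest_inference_batch(model, input_shape=(299, 299, 3), start=64, limit=8192):
    # Double the batch size until prediction fails with an out-of-memory error,
    # then return the last size that worked. `model` is an already-loaded Keras model.
    batch_size, largest_ok = start, None
    while batch_size <= limit:
        try:
            dummy = np.zeros((batch_size, *input_shape), dtype=np.float32)
            model.predict(dummy, batch_size=batch_size, verbose=0)
            largest_ok = batch_size
            batch_size *= 2
        except tf.errors.ResourceExhaustedError:
            break
    return largest_ok
```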
EDIT: Here's an additional useful link (p. 276) for the training batch size trade-off, from Goodfellow et al.'s Deep Learning book.