I am training deep neural networks with a GPU. If I make samples too large, batches too large, or networks too deep, I get an out of memory error. In this case, it is sometimes possible to make smaller batches and still train.
Is it possible to calculate the GPU memory required for training and determine beforehand what batch size to choose?
UPDATE
If I print the network summary, it displays the number of "trainable parameters". Can't I estimate from this value? For example, take this, multiply by batch size, double for gradients, etc.?
In practical terms, to determine the optimum batch size, we recommend trying smaller batch sizes first (usually 32 or 64), keeping in mind that small batch sizes require small learning rates. The batch size should be a power of 2 to take full advantage of the GPU's processing.
If the batch size is very small, there is relatively high overhead in firing up the GPU and waiting for the results. With a larger batch size, that overhead still exists but is now amortized (divided) over more examples, so you spend less time waiting. Ideally, your GPU should be 100% busy when training.
The per-item setup cost is computed simply by dividing the batch setup cost by the batch size. A batch size of one means that one item bears the total cost. A batch size of ten means that the setup cost is 1/10 per item (ten times less).
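To make the amortization concrete, here is a minimal sketch. The function name `per_item_cost` and the timing numbers are made up for illustration; they are not measurements from any real GPU.

```python
def per_item_cost(setup_cost: float, compute_cost_per_item: float, batch_size: int) -> float:
    """Average cost per example when a fixed setup cost is shared by the whole batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return setup_cost / batch_size + compute_cost_per_item


if __name__ == "__main__":
    # Hypothetical numbers: 1.0 ms fixed overhead per batch, 0.05 ms of compute per example.
    for bs in (1, 10, 100):
        print(f"batch size {bs:>3}: {per_item_cost(1.0, 0.05, bs):.3f} ms per example")
```

The setup share shrinks as 1/batch size, so the gain from going ever larger flattens out once the per-example compute dominates.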
The batch size depends on the size of the images in your dataset; choose the largest batch size that your GPU RAM can hold. Also, the batch size should be neither too large nor too small, and chosen so that almost the same number of images is processed in every step of an epoch.
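One way to check the "same number of images in every step" condition is to look at the size of the final batch for each candidate. This is a small sketch; `epoch_layout` and the dataset size of 1000 are hypothetical.

```python
import math
from typing import Tuple


def epoch_layout(num_samples: int, batch_size: int) -> Tuple[int, int]:
    """Return (steps per epoch, size of the final batch)."""
    steps = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (steps - 1) * batch_size
    return steps, last_batch


if __name__ == "__main__":
    # Hypothetical dataset of 1000 images.
    for bs in (32, 50, 64):
        steps, last = epoch_layout(1000, bs)
        print(f"batch size {bs}: {steps} steps, final batch has {last} images")
```

A final batch much smaller than the others gives a noisier gradient for that step; many data loaders also offer an option to drop it.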
No, it is not possible to do this automatically. You need to go through a lot of trial and error to find an appropriate size if you want your batch to be as large as possible.
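The trial and error can at least be scripted. The sketch below doubles the batch size until a trial step fails; `find_max_batch_size` and `try_batch` are hypothetical names, and `try_batch` stands for whatever you use to run one forward and backward pass at a given batch size. The exception type to catch depends on the framework (with PyTorch, a CUDA out-of-memory failure surfaces as a `RuntimeError`), so it is left as a parameter.

```python
from typing import Callable, Type


def find_max_batch_size(
    try_batch: Callable[[int], None],
    start: int = 1,
    limit: int = 4096,
    oom_error: Type[BaseException] = MemoryError,
) -> int:
    """Double the batch size until `try_batch` raises `oom_error`.

    Returns the largest size that succeeded, or 0 if even `start` failed.
    """
    best = 0
    size = start
    while size <= limit:
        try:
            try_batch(size)  # e.g. one training step on a dummy batch of this size
        except oom_error:
            break
        best = size
        size *= 2
    return best


if __name__ == "__main__":
    # Stand-in for a real training step: pretend anything above 100 samples runs out of memory.
    def fake_step(batch_size: int) -> None:
        if batch_size > 100:
            raise MemoryError("simulated out of memory")

    print(find_max_batch_size(fake_step))
```

In practice it is safer to use a size somewhat below the reported maximum, since memory use can vary between steps.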
Stanford's CNN class provides some guidance on how to estimate the memory size, but all of its suggestions relate to CNNs (I am not sure what you are training).
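A rough back-of-the-envelope version of that kind of estimate is sketched below. It assumes float32 values, that activations and their gradients are both kept, and that the optimizer stores a fixed number of extra values per parameter; the function names and all numbers are hypothetical. Note that the parameter term does not scale with batch size, only the activation term does, so "trainable parameters times batch size" is not the right product. The result ignores framework workspace buffers and fragmentation, so treat it as a lower bound.

```python
def estimate_training_memory_bytes(
    num_params: int,
    activations_per_sample: int,
    batch_size: int,
    bytes_per_value: int = 4,             # assumption: float32
    optimizer_states_per_param: int = 2,  # assumption: e.g. two moments for Adam, 0 for plain SGD
) -> int:
    """Rough lower bound on training memory in bytes."""
    # Weights + gradients + optimizer state: independent of batch size.
    param_bytes = num_params * (2 + optimizer_states_per_param) * bytes_per_value
    # Activations kept for backprop plus their gradients: grows linearly with batch size.
    activation_bytes = 2 * activations_per_sample * batch_size * bytes_per_value
    return param_bytes + activation_bytes


def max_batch_size_for_budget(
    budget_bytes: int,
    num_params: int,
    activations_per_sample: int,
    bytes_per_value: int = 4,
    optimizer_states_per_param: int = 2,
) -> int:
    """Largest batch size whose estimate fits in `budget_bytes` (0 if none fits)."""
    param_bytes = num_params * (2 + optimizer_states_per_param) * bytes_per_value
    per_sample_bytes = 2 * activations_per_sample * bytes_per_value
    return max(0, (budget_bytes - param_bytes) // per_sample_bytes)


if __name__ == "__main__":
    # Hypothetical network: 1M parameters, 500k activation values per sample.
    print(estimate_training_memory_bytes(1_000_000, 500_000, batch_size=32))
    print(max_batch_size_for_budget(1_000_000_000, 1_000_000, 500_000))
```

The number of activation values per sample has to be summed over all layer outputs, which the parameter count alone does not tell you.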
PyTorch Lightning recently added a feature called "auto batch size", especially for this! It computes the max batch size that can fit into the memory of your GPU :)
More info can be found here.
Original PR: https://github.com/PyTorchLightning/pytorch-lightning/pull/1638