I have seen many tutorials do this, and I have been adhering to the same practice myself.
When choosing the batch size for training, we pick a power of 2, such as 2, 4, 8, 16, 32, or 64.
We select the number of neurons in the hidden layers the same way, from 2, 4, 8, 16, 32, 64, 128, 256, 512, ...
What is the core reason behind this? Why does the neural network perform better this way?
If you use NVIDIA GPUs (the most popular choice for deep learning), the choice of channel size for convolutions and width for fully-connected layers mostly has to do with enabling Tensor Cores, which, as the name implies, accelerate tensor and matrix operations (and therefore convolutions). To quote the NVIDIA guide on performance for deep learning:
Choose the number of input and output channels to be divisible by 8 to enable Tensor Cores
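You can observe this effect directly. Below is a minimal sketch, assuming PyTorch and a CUDA GPU with Tensor Cores; the matrix sizes (4096 vs. 4095) are illustrative, and the actual speedup depends on the GPU architecture and cuBLAS kernel selection:

import torch

def time_matmul(m, k, n, iters=100):
    # FP16 matmuls are eligible for Tensor Cores when dims are divisible by 8.
    a = torch.randn(m, k, device="cuda", dtype=torch.float16)
    b = torch.randn(k, n, device="cuda", dtype=torch.float16)
    # Warm up so kernel selection doesn't skew the timing.
    for _ in range(10):
        a @ b
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        a @ b
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters  # milliseconds per matmul

print(f"k = 4096 (divisible by 8): {time_matmul(4096, 4096, 4096):.3f} ms")
print(f"k = 4095 (not divisible):  {time_matmul(4096, 4095, 4096):.3f} ms")

On recent hardware the divisible-by-8 case is typically noticeably faster, because the non-divisible case falls back to slower kernels or pads internally.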
Similar guidelines are given for the batch size; there, however, the stated reason is tile and wave quantization: the GPU executes matrix multiplies in fixed-size tiles, so a dimension that does not divide evenly into the tile size still launches a full extra tile, part of whose compute is wasted.
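For intuition, here is a back-of-the-envelope sketch of tile quantization. The tile size of 128 is an illustrative assumption; real tile sizes vary by kernel and hardware:

import math

def wasted_fraction(n, tile=128):
    # Fraction of launched tile compute that is wasted when a
    # dimension of size n is covered by fixed-size tiles.
    tiles = math.ceil(n / tile)
    return 1 - n / (tiles * tile)

for n in (256, 257, 512, 513):
    print(f"n = {n}: {wasted_fraction(n):.1%} of tile compute wasted")

Going from 256 to 257 forces a third tile, so roughly a third of the launched work is idle, which is why sizes that divide evenly (in practice, powers of 2) tend to run more efficiently.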