 

Why does TensorFlow GPU memory usage decrease when I increase the batch size?

Recently I implemented a VGG-16 network in both TensorFlow and PyTorch; the dataset is CIFAR-10, where each image is 32 × 32 RGB.

I started with a batch size of 64 and noticed that PyTorch uses much less GPU memory than TensorFlow. Then I ran some experiments and got the figure posted below.

Figure: GPU memory usage comparison

After some research, I learned that TensorFlow uses the BFC (best-fit with coalescing) algorithm to manage memory. That explains why the memory usage decreases or increases in steps of 2048 MB, 1024 MB, and so on, and why the memory usage sometimes does not increase when the batch size gets bigger.

But I am still confused: why is the memory usage lower at batch size 512 than at the smaller batch sizes 384 and 448? The same happens from batch size 1024 to 1408, and from batch size 2048 to 2688. A rough illustration of the chunked allocation behaviour is sketched below.
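To show why usage moves in coarse steps rather than linearly with batch size, here is a minimal sketch (not TensorFlow's actual BFC implementation) of an allocator that rounds each request up to an alignment and serves it from a power-of-two-sized chunk. The tensor shape is just the CIFAR-10 float32 input batch as an example; the batch sizes are the ones from the question.

def round_up(num_bytes, alignment=256):
    # Round a request up to the allocator's alignment granularity.
    return ((num_bytes + alignment - 1) // alignment) * alignment

def pow2_chunk(num_bytes):
    # Smallest power-of-two chunk that can hold the request.
    size = 256
    while size < num_bytes:
        size *= 2
    return size

# Hypothetical: bytes for one float32 CIFAR-10 input batch (N x 32 x 32 x 3).
for batch in (384, 448, 512):
    request = batch * 32 * 32 * 3 * 4
    print(batch, round_up(request), pow2_chunk(request))

Because whole chunks are reserved, two different batch sizes can land in the same chunk, and the reserved total need not grow monotonically with the requested total.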

Here is my source code:

PyTorch: https://github.com/liupeng3425/tesorflow-vgg/blob/master/vgg-16-pytorch.py

TensorFlow: https://github.com/liupeng3425/tesorflow-vgg/blob/master/vgg-16.py


Edit: I have two Titan Xp GPUs in my computer; OS: Linux Mint 18.2 64-bit.

I determine GPU memory usage with the nvidia-smi command.

My code runs on GPU 1, which is selected in my code:

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

And I am sure only one application is using GPU 1.

GPU memory usage can be read from the per-process list that nvidia-smi prints. For example, in the screenshot posted below, the process name is /usr/bin/python3 and its GPU memory usage is 1563 MiB.

Figure: nvidia-smi output in the terminal
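If you want to log the same numbers from a script rather than reading the table by eye, nvidia-smi's query interface can be called from Python. A small sketch, assuming nvidia-smi is on the PATH:

import subprocess

def gpu_memory_by_process():
    # Ask nvidia-smi for pid, process name and used memory (in MiB) per compute process.
    out = subprocess.check_output([
        "nvidia-smi",
        "--query-compute-apps=pid,process_name,used_gpu_memory",
        "--format=csv,noheader,nounits",
    ]).decode()
    return [line.split(", ") for line in out.strip().splitlines() if line]

for pid, name, mem in gpu_memory_by_process():
    print(pid, name, mem, "MiB")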

asked Nov 27 '17 by lp5


People also ask

Does batch size affect GPU memory?

Generally speaking, memory consumption increases with larger batch sizes. The framework used, the model and its parameters, and each batch of data all affect memory usage.

What batch size should I use per GPU?

With multiple GPUs, the rule of thumb is to use a batch size of at least 16 or so per GPU; with a batch size of 4 or 8, a GPU cannot be fully utilized during training. Across multiple GPUs there may also be slight differences due to precision error.


1 Answer

As noted in the comments, by default TensorFlow takes up all of the memory on a GPU. I assume you disabled that behavior for this test, but it does show that the framework does not generally try to minimize the memory it reserves, even if not all of it is used in the calculations.

To find the optimal configuration for your device and code, TensorFlow often runs (parts of) the first calculation multiple times. I suspect this includes settings for pre-loading data onto the GPU. This would mean that the numbers you see happen to be the optimal values for your device and configuration.

Since TensorFlow doesn't mind using more memory, 'optimal' here means fastest, not lowest memory usage.
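For completeness, this is the usual way to disable the grab-all-memory default in the TF 1.x Session API (which I assume the question's code uses, given its age), so that nvidia-smi reflects what the graph actually allocates. A minimal sketch:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True   # allocate GPU memory on demand
# Alternatively, cap the fraction of GPU memory TensorFlow may reserve:
# config.gpu_options.per_process_gpu_memory_fraction = 0.5

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    # ... train as usual ...

With allow_growth enabled you still see the step-like behaviour from the allocator, just measured from a smaller baseline.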

answered Sep 17 '22 by Edwin van der Helm