
TensorFlow: How to measure how much GPU memory each tensor takes?

Tags:

tensorflow

I'm currently implementing YOLO in TensorFlow and I'm a little surprised at how much memory it is taking. On my GPU I can train YOLO with their Darknet framework using a batch size of 64. In TensorFlow I can only manage a batch size of 6; with 8 I already run out of memory. For the test phase I can run with batch size 64 without running out of memory.

  1. How can I calculate how much memory each tensor consumes? Are all tensors saved on the GPU by default? Can I simply calculate the total memory consumption as shape * 32 bits?

  2. I noticed that since I'm using momentum, each of my variables also has a /Momentum tensor. Could that be using a lot of memory as well?

  3. I am augmenting my dataset using a distorted_inputs method, very similar to the one defined in the CIFAR-10 tutorial. Could this part be occupying a huge chunk of memory? I believe Darknet does these modifications on the CPU.
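On question 1, a rough back-of-the-envelope estimate (my own sketch, not something TensorFlow reports directly) is the product of the shape's dimensions times the byte size of the dtype; and since a /Momentum slot has the same shape as its variable, memory for trainable variables roughly doubles:

```python
import numpy as np

def tensor_bytes(shape, dtype=np.float32):
    # Number of elements times bytes per element (float32 = 4 bytes).
    return int(np.prod(shape)) * np.dtype(dtype).itemsize

# Example: a batch of 64 images of 448x448x3 in float32.
print(tensor_bytes((64, 448, 448, 3)) / 1024**2)  # -> 147.0 MB

# With momentum, each trainable variable also carries a slot of the
# same shape, so an estimate for variables is roughly 2x this figure.
```

Note this only counts a single tensor; during training the activations of every layer are kept alive for the backward pass, which is usually why training needs a much smaller batch size than inference.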

asked Mar 31 '16 by Clash


2 Answers

Now that issue 1258 has been closed, you can enable memory logging in Python by setting an environment variable before importing TensorFlow:

import os
os.environ['TF_CPP_MIN_VLOG_LEVEL']='3'
import tensorflow as tf

There will be a lot of logging as a result of this. You'll want to grep the results to find the appropriate lines. For example:

grep MemoryLogTensorAllocation train.log
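To turn those grepped lines into a quick total, one approach is to sum the allocated_bytes figures they contain (a sketch of mine; the field name is taken from TensorFlow's allocation-description log format, so check it against your own log output):

```python
import re

def total_allocated_bytes(log_text):
    # Sum every "allocated_bytes: N" field found in the log text.
    return sum(int(n) for n in re.findall(r'allocated_bytes:\s*(\d+)', log_text))

sample = (
    "MemoryLogTensorAllocation ... allocated_bytes: 1024 ...\n"
    "MemoryLogTensorAllocation ... allocated_bytes: 2048 ...\n"
)
print(total_allocated_bytes(sample))  # -> 3072
```

This over-counts peak usage (it ignores deallocations), but it is a quick way to see which ops dominate.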
answered Sep 17 '22 by Erik Shilts


Sorry for the slow reply. Unfortunately, right now the only way to set the log level is to edit tensorflow/core/platform/logging.h and recompile with, e.g.,

#define VLOG_IS_ON(lvl) ((lvl) <= 1)

There is an open bug (1258) to control logging more elegantly.

MemoryLogTensorOutput entries are logged at the end of each Op execution, and indicate the tensors that hold the outputs of the Op. It's useful to know these tensors since the memory is not released until the downstream Op consumes the tensors, which may be much later on in a large graph.

answered Sep 21 '22 by Michael Isard