How to get detailed memory breakdown in the TensorFlow profiler?

I'm using the new TensorFlow profiler to profile memory usage in my neural net, which I'm running on a Titan X GPU with 12GB RAM. Here's some example output when I profile my main training loop:

==================Model Analysis Report======================
node name | requested bytes | ...
Conv2DBackpropInput        10227.69MB (100.00%, 35.34%),     ...
Conv2D                       9679.95MB (64.66%, 33.45%),     ...
Conv2DBackpropFilter         8073.89MB (31.21%, 27.90%),     ...

Obviously this adds up to more than 12GB, so some of these matrices must be in main memory while others are on the GPU. I'd love to see a detailed breakdown of what variables are where at a given step. Is it possible to get more detailed information on where various parameters are stored (main or GPU memory), either with the profiler or otherwise?
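
For context, a report like the one above can be produced roughly as in the following sketch (TF 1.x tf.profiler API; sess and train_op here are placeholders for the real session and training op, not the exact setup):

import tensorflow as tf

# Trace one training step so the profiler has per-op memory/time stats.
# sess and train_op are placeholders for the real session and training op.
run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()
sess.run(train_op, options=run_options, run_metadata=run_metadata)

# Print a per-op report like the Model Analysis Report shown above.
tf.profiler.profile(
    tf.get_default_graph(),
    run_meta=run_metadata,
    cmd='op',
    options=tf.profiler.ProfileOptionBuilder.time_and_memory())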

Emma Strubell asked Oct 26 '17


1 Answer

"Requested bytes" shows a sum over all memory allocations, but that memory can be allocated and de-allocated. So just because "requested bytes" exceeds GPU RAM doesn't necessarily mean that memory is being transferred to CPU.

In particular, for a feedforward neural network, TF will normally keep the forward activations around to make backprop efficient, but it doesn't need to keep the intermediate backprop activations (i.e. dL/dh at each layer), so it can discard those intermediates as soon as it is done with them. So I think what you care about in this case is the memory used by Conv2D, which is less than 12 GB.

You can also use the timeline to verify that total memory usage never exceeds 12 GB.
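
For example, here is a minimal, self-contained sketch of dumping a per-step Chrome trace with memory counters, assuming the TF 1.x timeline module (the toy conv layer is just a stand-in for your real model):

import tensorflow as tf
from tensorflow.python.client import timeline

# Toy graph standing in for the real model (placeholder example).
x = tf.random_normal([64, 32, 32, 3])
loss = tf.reduce_mean(tf.layers.conv2d(x, 16, 3))
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
run_metadata = tf.RunMetadata()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op, options=run_options, run_metadata=run_metadata)

# Write a Chrome trace (viewable at chrome://tracing); show_memory=True adds
# per-allocator memory counters (e.g. the GPU BFC allocator) over the step.
tl = timeline.Timeline(run_metadata.step_stats)
with open('timeline_step.json', 'w') as f:
    f.write(tl.generate_chrome_trace_format(show_memory=True))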

knightian answered Sep 30 '22