TensorFlow efficient shared memory allocation for recursive concatenation

DenseNets tend to use a lot of memory in TensorFlow because the output of each concat operation is stored in a separate allocation. A recent paper, Memory-Efficient Implementation of DenseNets, demonstrates that this memory footprint can be dramatically reduced by sharing allocations. The following image from the paper and its PyTorch implementation illustrates the shared-memory approach:

[Image: shared-memory allocation strategy for DenseNet concatenations, from the paper]

How can this be implemented in TensorFlow? If it can't be done from Python, how can it be properly implemented as a custom Op with CPU and GPU support?

  • PyTorch efficient DenseNet implementation
  • Keras DenseNet implementation with "naive" allocations, which works with the TensorFlow backend.

I've created a TensorFlow feature request for the necessary allocation functionality.
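For concreteness, here is a minimal sketch of the "naive" pattern described above, in which every tf.concat materializes a fresh tensor holding copies of all earlier feature maps. The function and parameter names are illustrative and not taken from either linked implementation:

    import tensorflow as tf  # TF 1.x API, matching the snippets on this page

    def naive_dense_block(x, num_layers, growth_rate, training):
        """Naive dense block: every layer re-concatenates all previous features."""
        features = [x]
        for i in range(num_layers):
            with tf.variable_scope('layer_%d' % i):
                # Each tf.concat result is a separate allocation that copies all
                # previous feature maps, and each is kept around for backprop,
                # so memory grows quadratically with depth.
                concatenated = tf.concat(features, axis=-1)
                y = tf.layers.batch_normalization(concatenated, training=training)
                y = tf.nn.relu(y)
                y = tf.layers.conv2d(y, growth_rate, 3, padding='same', use_bias=False)
            features.append(y)
        return tf.concat(features, axis=-1)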

Asked Sep 08 '17 at 21:09 by Andrew Hundt


1 Answer

A memory-efficient implementation is now available at:

https://github.com/joeyearsley/efficient_densenet_tensorflow

The relevant call from the above repository wraps the layer function `_x` with gradient checkpointing:

    # Gradient checkpoint the layer
    _x = tf.contrib.layers.recompute_grad(_x)
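For context, here is a rough sketch (not the repository's exact code) of how such gradient checkpointing can be applied to a dense block: each layer's BN-ReLU-Conv function is wrapped so that its intermediate activations, including the concatenated input, are recomputed during the backward pass instead of being kept in memory. The helper names (`_bn_relu_conv`, `dense_block`) and hyperparameters are illustrative:

    import functools
    import tensorflow as tf  # TF 1.x, for tf.contrib.layers.recompute_grad

    def _bn_relu_conv(ip, growth_rate, training):
        """BN-ReLU-Conv sub-block whose activations may be recomputed."""
        x = tf.layers.batch_normalization(ip, training=training)
        x = tf.nn.relu(x)
        return tf.layers.conv2d(x, growth_rate, 3, padding='same', use_bias=False)

    def dense_block(x, num_layers, growth_rate, training, efficient=True):
        features = [x]
        for i in range(num_layers):
            with tf.variable_scope('layer_%d' % i):
                layer_fn = functools.partial(
                    _bn_relu_conv, growth_rate=growth_rate, training=training)
                if efficient:
                    # Gradient-checkpoint the layer: its forward activations are
                    # discarded after use and recomputed during backprop, trading
                    # extra compute for a much smaller peak memory footprint.
                    layer_fn = tf.contrib.layers.recompute_grad(layer_fn)
                y = layer_fn(tf.concat(features, axis=-1))
            features.append(y)
        return tf.concat(features, axis=-1)

Note that recomputation re-runs the wrapped function on the backward pass, so ops with side effects (such as batch-norm statistics updates or dropout) need care. In TensorFlow 2.x, tf.recompute_grad provides similar functionality without tf.contrib.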
Answered Oct 14 '22 at 07:10 by Andrew Hundt