DenseNets tend to consume a lot of memory in TensorFlow because each concat operation is stored in a separate allocation. A recent paper, Memory-Efficient Implementation of DenseNets, demonstrates that this memory usage can be dramatically reduced by sharing allocations. A figure in the paper and its accompanying PyTorch implementation illustrates the shared-memory approach.
How can this be implemented in TensorFlow? If it can't be done from Python, how can it properly be implemented in an op with CPU and GPU support?
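For concreteness, a naive dense block in the TF 1.x API looks roughly like the sketch below (the function name and layer sizes are my own illustration, not from the paper). Every iteration materializes a new, strictly larger tf.concat result in its own buffer, which is where the quadratic memory growth comes from:

import tensorflow as tf  # TF 1.x style API

def naive_dense_block(x, num_layers=4, growth_rate=12):
    # Each layer reads the concatenation of all previous feature maps.
    # Every tf.concat below gets its own allocation, so the block keeps
    # on the order of num_layers^2 feature-map copies alive at once.
    features = [x]
    for _ in range(num_layers):
        inputs = tf.concat(features, axis=-1)  # new buffer each iteration
        out = tf.layers.conv2d(tf.nn.relu(inputs), growth_rate, 3,
                               padding='same')
        features.append(out)
    return tf.concat(features, axis=-1)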
I've created a TensorFlow Feature Request for the necessary allocation functionality.
A memory-efficient implementation is now available at:
https://github.com/joeyearsley/efficient_densenet_tensorflow
The relevant function from the above link is:
# _x is a function that builds one dense layer; wrapping it with
# recompute_grad discards its activations after the forward pass and
# recomputes them during backprop (gradient checkpointing).
_x = tf.contrib.layers.recompute_grad(_x)
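For context, here is a minimal sketch of how recompute_grad can be wired into a dense block (TF 1.x, where tf.contrib is available; the helper names dense_layer and checkpointed_dense_block are my own and are not taken from the linked repository):

import tensorflow as tf  # TF 1.x, provides tf.contrib.layers.recompute_grad

def dense_layer(x, growth_rate=12):
    # One ReLU-Conv layer; variables are created via tf.layers so the
    # recomputation pass reuses them instead of creating new ones.
    return tf.layers.conv2d(tf.nn.relu(x), growth_rate, 3, padding='same')

def checkpointed_dense_block(x, num_layers=4):
    for i in range(num_layers):
        with tf.variable_scope('dense_layer_%d' % i):
            # Wrap the layer function: its activations are freed after the
            # forward pass and recomputed during backprop, trading extra
            # compute for a large reduction in peak memory.
            layer_fn = tf.contrib.layers.recompute_grad(dense_layer)
            out = layer_fn(x)
        x = tf.concat([x, out], axis=-1)
    return x

Note that this is recomputation (gradient checkpointing) rather than the paper's exact shared-allocation scheme: intermediate outputs are not kept for backprop but regenerated on demand, which recovers most of the memory savings in practice.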