I have a loop in TensorFlow that looks like this:
with tf.device("/gpu:1"):
    losses = []
    for target, output in zip(targets, lstm_outputs):
        logits = tf.matmul(W, output) + b
        loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, target)
        losses.append(loss)
    total_loss = tf.add_n(losses)
I am getting an OOM error when allocating the gradients for this layer, since each matrix multiplication is a separate operation in the graph and each one consumes memory. Is there a way to prevent TensorFlow from allocating all of these operations at the same time?
This is a challenging graph for TensorFlow to optimize, since the activations from each step must be kept in memory in order to aggregate a single gradient for W. One possibility is to pass the experimental aggregation_method argument when calling optimizer.minimize().
For example, you could try the following:
optimizer = tf.train.AdagradOptimizer(...)  # Or another optimization algorithm.
train_op = optimizer.minimize(
    total_loss,
    aggregation_method=tf.AggregationMethod.EXPERIMENTAL_ACCUMULATE_N)
This option eagerly aggregates the gradients for recurrently used variables in place, rather than keeping them all in memory until every gradient has been computed. If this doesn't work, the tf.AggregationMethod.EXPERIMENTAL_TREE option may work better.
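For reference, switching to the tree-based aggregation is only a one-argument change. A minimal sketch, reusing the optimizer and total_loss defined above:

# Sketch only: same optimizer and loss as above, with the
# aggregation_method argument swapped to the tree-based variant.
train_op = optimizer.minimize(
    total_loss,
    aggregation_method=tf.AggregationMethod.EXPERIMENTAL_TREE)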