 

TensorFlow dense gradient explanation?

Tags:

tensorflow

I recently implemented a model and when I ran it I received this warning:

UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. 
This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "

With similar parameter settings (embedding dimensionalities), the model suddenly becomes ridiculously slow.

  1. What does this warning imply? It appears that something I've done has caused all of the gradients to be dense, so backprop is doing dense matrix computations.
  2. If there is an issue with the model that's causing this, how can I identify and fix it?
asked Mar 09 '16 by Taaam

People also ask

How does gradient work in TensorFlow?

TensorFlow "records" relevant operations executed inside the context of a tf.GradientTape onto a "tape". TensorFlow then uses that tape to compute the gradients of the "recorded" computation using reverse-mode differentiation.
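
As a rough sketch (TF 2.x eager API, with hypothetical values), recording a computation on a tape and then differentiating it looks like this:

import tensorflow as tf

# Record y = x^2 on the tape, then differentiate it with respect to x.
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x * x

# Reverse-mode differentiation over the recorded ops: dy/dx = 2x = 6.0
dy_dx = tape.gradient(y, x)
print(dy_dx)  # tf.Tensor(6.0, ...)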

How does TensorFlow compute derivatives?

TensorFlow calculates derivatives using automatic differentiation. This is different from symbolic differentiation and numeric differentiation (a.k.a. finite differences). More than a smart math approach, it is a smart programming approach.

How does TensorFlow implement backpropagation?

In TensorFlow it seems that the entire backpropagation algorithm is performed by a single run of an optimizer on a given cost function, which is the output of some MLP or CNN.
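
For example, a minimal sketch in the TF 1.x style used elsewhere on this page (the toy variable and cost are hypothetical):

import tensorflow as tf

# A toy scalar cost built from a single trainable variable.
w = tf.Variable(5.0)
cost = tf.square(w - 2.0)

# minimize() adds both the gradient computation (backprop) and the
# parameter-update op to the graph in a single call.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train_op = optimizer.minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(50):
        sess.run(train_op)  # one backprop + update step per run
    print(sess.run(w))      # converges towards 2.0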


3 Answers

This warning is printed when a sparse tf.IndexedSlices object is implicitly converted to a dense tf.Tensor. This typically happens when one op (usually tf.gather()) backpropagates a sparse gradient, but the op that receives it does not have a specialized gradient function that can handle sparse gradients. As a result, TensorFlow automatically densifies the tf.IndexedSlices, which can have a devastating effect on performance if the tensor is large.

To fix this problem, you should try to ensure that the params input to tf.gather() (or the params input to tf.nn.embedding_lookup()) is a tf.Variable. Variables can receive the sparse updates directly, so no conversion is needed. Although tf.gather() (and tf.nn.embedding_lookup()) accept arbitrary tensors as inputs, passing a derived tensor rather than a tf.Variable may lead to a more complicated backpropagation graph, resulting in the implicit conversion.
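
A minimal sketch of the two cases (TF 1.x style; shapes and names are hypothetical):

import tensorflow as tf

ids = tf.placeholder(tf.int32, [None])

# Sparse-friendly: params is a tf.Variable, so the gradient stays an
# IndexedSlices and the optimizer can apply a sparse update directly.
embeddings = tf.Variable(tf.random_normal([50000, 128]))
good = tf.nn.embedding_lookup(embeddings, ids)

# Triggers the warning: params is now the output of another op
# (tf.multiply), so the sparse gradient must be densified before it can
# flow back through that op.
scaled = embeddings * 2.0
bad = tf.nn.embedding_lookup(scaled, ids)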

answered Oct 10 '22 by mrry


A dense Tensor can be thought of as a standard Python array. A sparse one can be thought of as a collection of indices and values, e.g.

# dense
array = ['a', None, None, 'c']

# sparse
array = [(0, 'a'), (3, 'c')]

So, as you can see, if you have a lot of empty entries a sparse array will be much more efficient than a dense one, but if all entries are filled in, dense is far more efficient. In your case, somewhere in the TensorFlow graph a sparse array is being converted to a dense one of indeterminate size. The warning just says that you may waste a lot of memory this way. It might not be a problem at all, though, if the sparse array is not too big/is already quite dense.
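
To see where such a sparse object appears in practice, here is a sketch (TF 1.x, hypothetical shapes): the gradient of tf.gather() with respect to its params comes back as a tf.IndexedSlices (indices plus the corresponding rows) rather than a full dense tensor.

import tensorflow as tf

params = tf.Variable(tf.ones([4, 2]))
gathered = tf.gather(params, [0, 3])
loss = tf.reduce_sum(gathered)

# The gradient w.r.t. params only touches rows 0 and 3, so TensorFlow
# represents it sparsely as indices + values.
grad = tf.gradients(loss, [params])[0]
print(type(grad))  # tf.IndexedSlices, not a dense tf.Tensor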

If you want to diagnose it, I would advise naming your various tensor objects; the warning will then indicate exactly which ones are being used in this conversion, and you can work out what you might be able to adjust to remove it.
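
For instance, a sketch with hypothetical names (most graph-building ops and variables accept a name argument):

import tensorflow as tf

# Named variables and ops are much easier to trace back from graph
# warnings and errors than auto-generated names like "Gather_7".
embeddings = tf.Variable(tf.random_normal([1000, 64]), name="embeddings")
ids = tf.placeholder(tf.int32, [None], name="lookup_ids")
looked_up = tf.gather(embeddings, ids, name="embedding_gather")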

answered Oct 10 '22 by Daniel Slater


I totally agree with mrry's answer.

Actually, I will post another solution to this problem.

You could use tf.dynamic_partition() instead of tf.gather() to eliminate the warning.

The example code is below:

import tensorflow as tf

# `sequence`, `seqlen`, `partitions`, `weights`, and `bias` are assumed to be
# defined elsewhere (e.g. fed through placeholders).

# Create the cell for the RNN network
lstm = tf.nn.rnn_cell.BasicLSTMCell(128)

# Get the outputs and state from the dynamic RNN
output, state = tf.nn.dynamic_rnn(lstm, sequence, dtype=tf.float32, sequence_length=seqlen)

# Flatten the outputs to shape [batch_size * max_sequence_length, output_size]
outputs = tf.reshape(output, [-1, lstm.output_size])

# Set the number of partitions to 2
num_partitions = 2

# The partitions argument is a tensor which is already fed to a placeholder.
# It is a 1-D tensor with the length of batch_size * max_sequence_length.
# In this partitions tensor, you need to set the last output idx for each seq
# to 1 and leave the others at 0, so that the result can be separated into two
# parts: the last outputs and the non-last outputs.
res_out = tf.dynamic_partition(outputs, partitions, num_partitions)

# Prediction from the last outputs
preds = tf.matmul(res_out[1], weights) + bias
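
To make the tf.dynamic_partition() call itself concrete, here is a tiny standalone sketch with hypothetical values:

import tensorflow as tf

# Route each row of `data` to one of two partitions according to `partitions`.
data = tf.constant([[1., 1.], [2., 2.], [3., 3.], [4., 4.]])
partitions = tf.constant([0, 0, 1, 0])
parts = tf.dynamic_partition(data, partitions, 2)

with tf.Session() as sess:
    non_last, last = sess.run(parts)
    print(non_last)  # rows 0, 1 and 3
    print(last)      # row 2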

Hope this helps.

answered Oct 10 '22 by AI_ROBOT