 

In TensorFlow, what is tf.identity used for?

I've seen tf.identity used in a few places, such as the official CIFAR-10 tutorial and the batch-normalization implementation on Stack Overflow, but I don't see why it's necessary.

What's it used for? Can anyone give a use case or two?

One proposed answer is that it can be used for transfers between the CPU and GPU. This isn't clear to me. To extend the question: in the CIFAR-10 multi-GPU tutorial, loss = tower_loss(scope) appears under a GPU device block, which suggests that all operators defined in tower_loss are mapped to the GPU. Then, at the end of tower_loss, we see total_loss = tf.identity(total_loss) before it's returned. Why? What would be the flaw in not using tf.identity here?
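
For reference, the structure in question looks roughly like this (a paraphrase of the CIFAR-10 multi-GPU example, with details elided; tower_loss and num_gpus come from the tutorial):

for i in range(num_gpus):
    with tf.device('/gpu:%d' % i):
        with tf.name_scope('tower_%d' % i) as scope:
            # Ops created inside tower_loss are pinned to this GPU.
            loss = tower_loss(scope)

and, at the end of tower_loss, before the loss is returned:

with tf.control_dependencies([loss_averages_op]):
    total_loss = tf.identity(total_loss)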

asked Jan 19 '16 by rd11



2 Answers

After some stumbling I think I've noticed a single use case that fits all the examples I've seen. If there are other use cases, please elaborate with an example.

Use case:

Suppose you'd like to run an operator every time a particular Variable is evaluated. For example, say you'd like to add one to x every time y is evaluated. It might seem like this will work:

import tensorflow as tf  # TF 0.x/1.x-era graph API; xrange implies Python 2

x = tf.Variable(0.0)
x_plus_1 = tf.assign_add(x, 1)

with tf.control_dependencies([x_plus_1]):
    y = x

init = tf.initialize_all_variables()

with tf.Session() as session:
    init.run()
    for i in xrange(5):
        print(y.eval())

It doesn't: it prints 0, 0, 0, 0, 0. The control dependency applies only to ops created inside the block, and y = x creates no new op: it merely rebinds the Python name to the existing tensor. Instead, we need to add a new node to the graph within the control_dependencies block, so we use this trick:

x = tf.Variable(0.0)
x_plus_1 = tf.assign_add(x, 1)

with tf.control_dependencies([x_plus_1]):
    y = tf.identity(x)

init = tf.initialize_all_variables()

with tf.Session() as session:
    init.run()
    for i in xrange(5):
        print(y.eval())

This works: it prints 1, 2, 3, 4, 5.

If in the CIFAR-10 tutorial we dropped tf.identity, then loss_averages_op would never run.
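
Concretely, dropping the identity would leave something like this, where the plain Python assignment creates no graph node for the control dependency to attach to (a paraphrase, not the tutorial's actual code):

with tf.control_dependencies([loss_averages_op]):
    # Rebinds the Python name only; no op is created inside the block,
    # so the dependency on loss_averages_op is never attached to anything.
    total_loss = total_loss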

answered Oct 29 '22 by rd11


tf.identity is useful when you want to explicitly transport a tensor between devices (e.g., from a GPU to the CPU). The op adds send/recv nodes to the graph, which make a copy when the devices of the input and the output differ.

By default, the send/recv nodes are added implicitly when an operation happens on a different device, but you can imagine situations (especially in multi-threaded/distributed settings) where it is useful to fetch the value of a variable multiple times within a single execution of session.run. tf.identity allows more control over when the value is read from the source device. Possibly a more appropriate name for this op would be read.
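
For example, pinning an identity op to another device forces the copy at a point you choose (a minimal sketch using the TF 1.x graph API; the device strings and shapes are illustrative):

import tensorflow as tf

with tf.device('/gpu:0'):
    v = tf.Variable(tf.zeros([1000]))  # the variable lives on the GPU

with tf.device('/cpu:0'):
    # Placing the identity on the CPU makes TensorFlow insert the
    # send/recv pair here, copying the value across devices.
    v_cpu = tf.identity(v)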

Also, note that in the implementation of tf.Variable, the identity op is added in the constructor, which makes sure that all accesses to the variable copy the data from the source only once. Multiple copies can be expensive when the variable lives on a GPU but is read by multiple CPU ops (or the other way around). Users can change this behavior with additional calls to tf.identity when desired.
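
As a sketch of that last point (illustrative only; whether the two reads observe different values depends on concurrent writes during the run):

v = tf.Variable(0.0)

# Each identity is a separate read of v rather than one shared, cached copy.
read_a = tf.identity(v)
read_b = tf.identity(v)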

EDIT: Updated answer after the question was edited.

In addition, tf.identity can be used as a dummy node to update a reference to a tensor. This is useful with various control-flow ops. In the CIFAR case we want to enforce that the ExponentialMovingAverageOp updates the relevant variables before the value of the loss is retrieved. This can be implemented as:

with tf.control_dependencies([loss_averages_op]):
    total_loss = tf.identity(total_loss)

Here, tf.identity doesn't do anything useful aside from marking the total_loss tensor to be run after evaluating loss_averages_op.

answered Oct 29 '22 by Rafał Józefowicz