
What is the purpose of graph collections in TensorFlow?

The API documentation mentions graph collections, which, judging from the code, are a general-purpose key/data store. What is the purpose of these collections?

Asked Dec 12 '15 by Andrzej Pronobis

People also ask

What are graphs in TensorFlow?

Graphs are data structures that contain a set of tf.Operation objects, which represent units of computation, and tf.Tensor objects, which represent the units of data that flow between operations. They are defined in a tf.Graph context.

Why we are using computational graph for TensorFlow?

TensorFlow uses a dataflow graph to represent a computation in terms of the dependencies between individual operations. This leads to a low-level programming model in which you first define the dataflow graph, then create a TensorFlow session to run parts of the graph across a set of local and remote devices.

What is a data flow graph TensorFlow?

A dataflow graph is the representation of a computation in which the nodes represent units of computation and the edges represent the data consumed or produced by the computation. In the context of a tf.Graph, every API call defines a tf.Operation (a node) that can have multiple inputs and outputs, each of which is a tf.Tensor (an edge).
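To make the node/edge picture concrete, here is a minimal TensorFlow 1.x sketch (the names x, y, and z are made up for illustration); it inspects the tf.Operation behind a tensor and lists its input and output tensors:

import tensorflow as tf  # TensorFlow 1.x API

x = tf.constant(1.0, name="x")
y = tf.constant(2.0, name="y")
z = tf.add(x, y, name="z")  # tf.add adds a tf.Operation node to the graph

op = z.op                            # the node that produced z
print(op.type)                       # e.g. 'Add'
print([t.name for t in op.inputs])   # => ['x:0', 'y:0']  (incoming edges)
print([t.name for t in op.outputs])  # => ['z:0']         (outgoing edge)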

What is graph and session in TensorFlow?

It's simple: a graph defines the computation. It doesn't compute anything and it doesn't hold any values; it just defines the operations that you specified in your code. A session allows you to execute graphs or parts of graphs.
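As a minimal sketch of that split (TensorFlow 1.x API; the values are arbitrary):

import tensorflow as tf  # TensorFlow 1.x API

# Graph construction: nothing is computed yet.
a = tf.constant(2)
b = tf.constant(3)
c = a + b  # just adds an Add node to the default graph

# Execution: the session actually runs the requested part of the graph.
with tf.Session() as sess:
    print(sess.run(c))  # => 5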


2 Answers

Remember that under the hood, TensorFlow is a system for specifying and then executing computational dataflow graphs. The graph collections are used as part of keeping track of the constructed graphs and how they must be executed. For example, when you create certain kinds of ops, such as tf.train.batch_join, the code that adds the op will also add some queue runners to the QUEUE_RUNNERS graph collection. Later, when you call tf.train.start_queue_runners(), it will by default look at the QUEUE_RUNNERS collection to know which runners to start.
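At its core a collection is just a named list attached to the graph, which any part of the program can append to and read from. A minimal sketch (the collection key "my_losses" is made up for illustration; TensorFlow 1.x API):

import tensorflow as tf  # TensorFlow 1.x API

# Store a tensor under an arbitrary key on the default graph.
loss = tf.constant(0.5, name="loss")
tf.add_to_collection("my_losses", loss)

# Anywhere else in the program, look up everything stored under that key.
for t in tf.get_collection("my_losses"):
    print(t)  # => Tensor("loss:0", shape=(), dtype=float32)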

Answered by dga


I think there are at least two benefits so far:

  1. When you distribute your program across multiple GPUs or machines, it is convenient to gather the losses computed on the different devices into one collection and then use tf.add_n to accumulate them into the total loss (see the sketch after this list).
  2. Updating a particular set of variables, such as weights and biases, in your own way.
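For the first point, a minimal sketch (the collection key "tower_losses" and the constant per-tower losses are made up for illustration):

import tensorflow as tf  # TensorFlow 1.x API

# On each device/tower, add the per-tower loss to a shared collection.
for i in range(2):  # two hypothetical towers
    with tf.name_scope("tower_%d" % i):
        tower_loss = tf.constant(1.0, name="loss")
        tf.add_to_collection("tower_losses", tower_loss)

# Later, sum everything that has accumulated in the collection.
total_loss = tf.add_n(tf.get_collection("tower_losses"))

with tf.Session() as sess:
    print(sess.run(total_loss))  # => 2.0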

An example of the second point:

import tensorflow as tf  # TensorFlow 1.x API

# Register the variables only in the WEIGHTS collection
# (passing collections= replaces the default GLOBAL_VARIABLES).
w = tf.Variable([1, 2, 3], collections=[tf.GraphKeys.WEIGHTS], dtype=tf.float32)
w2 = tf.Variable([11, 22, 32], collections=[tf.GraphKeys.WEIGHTS], dtype=tf.float32)

# Initialize exactly the variables stored in the WEIGHTS collection.
weight_init_op = tf.variables_initializer(tf.get_collection_ref(tf.GraphKeys.WEIGHTS))
sess = tf.InteractiveSession()
weight_init_op.run()

# Build one update op per weight and collect them under UPDATE_OPS.
for vari in tf.get_collection_ref(tf.GraphKeys.WEIGHTS):
    tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, vari.assign(0.2 * vari))

weight_update_ops = tf.get_collection_ref(tf.GraphKeys.UPDATE_OPS)
for op in weight_update_ops:
    print(op.eval())  # evaluating each assign op also mutates the variable

The output:

[0.2 0.4 0.6]
[2.2 4.4 6.4]
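Because the update ops live in a collection, any other part of the program can fetch and run them by key, without holding direct references to the variables themselves.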
Answered by Lerner Zhang