I am exploring how a variable is represented in the graph. I create a variable, initialize it, and take a graph snapshot after each action:
import tensorflow as tf

def dump_graph(g, filename):
    with open(filename, 'w') as f:
        print(g.as_graph_def(), file=f)

g = tf.get_default_graph()
var = tf.Variable(2)
dump_graph(g, 'data/after_var_creation.graph')
init = tf.global_variables_initializer()
dump_graph(g, 'data/after_initializer_creation.graph')
with tf.Session() as sess:
    sess.run(init)
    dump_graph(g, 'data/after_initializer_run.graph')
The graph after variable creation looks like this:
node {
  name: "Variable/initial_value"
  op: "Const"
  attr {
    key: "dtype"
    value {
      type: DT_INT32
    }
  }
  attr {
    key: "value"
    value {
      tensor {
        dtype: DT_INT32
        tensor_shape {
        }
        int_val: 2
      }
    }
  }
}
node {
  name: "Variable"
  op: "VariableV2"
  attr {
    key: "container"
    value {
      s: ""
    }
  }
  attr {
    key: "dtype"
    value {
      type: DT_INT32
    }
  }
  attr {
    key: "shape"
    value {
      shape {
      }
    }
  }
  attr {
    key: "shared_name"
    value {
      s: ""
    }
  }
}
node {
  name: "Variable/Assign"
  op: "Assign"
  input: "Variable"
  input: "Variable/initial_value"
  attr {
    key: "T"
    value {
      type: DT_INT32
    }
  }
  attr {
    key: "_class"
    value {
      list {
        s: "loc:@Variable"
      }
    }
  }
  attr {
    key: "use_locking"
    value {
      b: true
    }
  }
  attr {
    key: "validate_shape"
    value {
      b: true
    }
  }
}
node {
  name: "Variable/read"
  op: "Identity"
  input: "Variable"
  attr {
    key: "T"
    value {
      type: DT_INT32
    }
  }
  attr {
    key: "_class"
    value {
      list {
        s: "loc:@Variable"
      }
    }
  }
}
versions {
  producer: 21
}
There are several nodes: Variable/initial_value, Variable, Variable/Assign, and Variable/read.
After creating the init operation, another node is added:
node {
  name: "init"
  op: "NoOp"
  input: "^Variable/Assign"
}
I do not have a tight grasp of what happens here. Why do we need the Variable/read node, and what does ^Variable/Assign inside the init node mean? Presumably session.run() substitutes something for this value somewhere, but I do not know the gory details.

The implementation of TensorFlow's tf.Variable class can be found in the source repository here. The Python wrapper class is responsible for creating several nodes in the dataflow graph, and for providing convenient accessors for using them. I'll use the names from your example to make things clear:
Node Variable, of type VariableV2, is the stateful TensorFlow op that owns the memory for the variable. Every time you run that op, it emits the buffer (as a "ref tensor") so that other ops can read or write it.
Node Variable/initial_value (of type Const) is the tensor that you provided as the initial_value argument of the tf.Variable constructor. This can be any type of tensor, although commonly it's a tf.random_*() op used for random weight initialization. The suffix initial_value implies that it was probably created by passing a non-tensor (here, the Python int 2) that was implicitly converted to a tensor.
Node Variable/Assign, of type Assign, is the initializer operation that writes the initial value into the variable's memory. It is typically run once, when you do sess.run(tf.global_variables_initializer()) later in your program.
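This also answers the ^Variable/Assign question: in a GraphDef, an input name that begins with ^ is a control dependency, meaning "run that node first, but consume no tensor from it" — so the init NoOp simply forces Variable/Assign to run. As a sketch (plain Python, not a TensorFlow API), a node's input list can be split into data and control inputs like this:

```python
def split_inputs(inputs):
    """Split a GraphDef node's input names into (data, control) inputs.

    Control inputs are marked by a leading '^' in the serialized graph.
    """
    data, control = [], []
    for name in inputs:
        if name.startswith("^"):
            control.append(name[1:])  # strip the '^' marker
        else:
            data.append(name)
    return data, control

# The "init" NoOp from the dump above: no data inputs, one control input.
print(split_inputs(["^Variable/Assign"]))
# The "Variable/Assign" node: two ordinary data inputs.
print(split_inputs(["Variable", "Variable/initial_value"]))
```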
Node Variable/read, of type Identity, is an operation that "dereferences" the Variable op's "ref tensor" output. This is mostly an implementation detail, but it provides desirable behavior when a variable is read multiple times across process boundaries: in particular, the value is only copied once, because the output of this op is not a "ref tensor". (If instead the "ref" edge is partitioned between processes, TensorFlow will copy the variable multiple times. This is occasionally useful — if you want to see the effect of a write on a different device in the same step — but it's quite niche.)
Variable initialization in TensorFlow is explicit, and this can cause headaches (e.g. if you forget to run the initializers for all of your variables). However, the reason we don't do it implicitly is that there are many ways to initialize a variable: from a tensor, from a checkpoint, or from another process (when doing between-graph replication). TensorFlow can't guess which one you intend, so it makes the process explicit.
The "loc:@Variable" syntax is used to colocate nodes on the same device. In particular, any op that has this value for its _class attr will be placed on the same device as the Variable operation.
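The "loc:@<name>" strings in a _class attr can be decoded with a few lines of plain Python; this helper is an illustration for reading the dump above, not a TensorFlow API:

```python
def colocation_group(class_attr_values):
    """Return the node names an op is colocated with, given its _class attr strings."""
    prefix = "loc:@"
    return [v[len(prefix):] for v in class_attr_values if v.startswith(prefix)]

# Both Variable/Assign and Variable/read carry this attr in the dump above,
# pinning them to the same device as the Variable op:
print(colocation_group(["loc:@Variable"]))  # prints ['Variable']
```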
Retrieving the value of a variable is quite simple: the variable op outputs a tensorflow::Tensor value, and this value can be copied back through the tensorflow::Session::Run() API. The Python bindings then copy this value into a NumPy array.