Using the two functions seems to give the same result:
t4 = tf.get_variable('t4', initializer=tf.random_normal((2,), seed=0))
t5 = tf.get_variable('t5', shape=(2,), initializer=tf.random_normal_initializer(seed=0))
I also found that random_normal_initializer() uses random_normal() internally.
I only vaguely understand the difference between them: random_normal returns a constant tensor, while random_normal_initializer only produces a value after initialization.
I want to know when each of these two functions should be used. Does using random_normal to initialize a variable actually initialize it twice (once when building the tensor and again when the variable's initializer runs)? In other words, is there a performance difference between them?
As far as I know, tf.Variable is the default way to create a variable, while tf.get_variable is mainly used for weight sharing.
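For context, a minimal sketch of the weight-sharing pattern that tf.get_variable enables (the scope and variable names here are only illustrative):

import tensorflow as tf

tf.reset_default_graph()

def dense(x):
    # get_variable looks the variable up by name in the current variable_scope,
    # so a second call with reuse=True shares the same weights.
    w = tf.get_variable('w', shape=(2, 2),
                        initializer=tf.random_normal_initializer(seed=0))
    return tf.matmul(x, w)

x = tf.ones((1, 2))
with tf.variable_scope('layer'):
    y1 = dense(x)  # creates 'layer/w'
with tf.variable_scope('layer', reuse=True):
    y2 = dense(x)  # reuses 'layer/w' instead of creating a new variable

v = tf.Variable(tf.zeros((2,)), name='v')  # tf.Variable always creates a fresh variable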
tf.constant_initializer returns an object which, when called, returns a tensor populated with the value specified in the constructor. This value must be convertible to the requested dtype. The value argument can be a scalar constant or a list of values.
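For instance (a minimal sketch; the values are only illustrative):

import tensorflow as tf

tf.reset_default_graph()
# A scalar value: every element is filled with 7.0.
a = tf.get_variable('a', shape=(3,), initializer=tf.constant_initializer(7.0))
# A list of values: must be convertible to the requested dtype and match the shape.
b = tf.get_variable('b', shape=(3,), initializer=tf.constant_initializer([1, 2, 3]))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(a))  # [7. 7. 7.]
    print(sess.run(b))  # [1. 2. 3.]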
Maxim's answer to this question is excellent, but I want to answer a slightly simpler question (with a few examples) that the OP might be asking:
Most basic answer: tf.random_normal is a Tensor, but tf.random_normal_initializer is a RandomNormal, not a Tensor. I think simple code best clarifies the difference between these two:
# Simple examples to clarify tf.random_normal from tf.random_normal_initializer
import tensorflow as tf

tf.reset_default_graph()

# OP's code
t4 = tf.get_variable('t4', initializer=tf.random_normal((2,), seed=0))
t5 = tf.get_variable('t5', shape=(2,), initializer=tf.random_normal_initializer(seed=0))

# clarifying Tensor vs Initializer outside the context of get_variable.
t6 = tf.random_normal((2,), seed=0)
t7 = tf.random_normal_initializer(seed=0)

# types
print(type(t6))  # <class 'tensorflow.python.framework.ops.Tensor'>
print(type(t7))  # <class 'tensorflow.python.ops.init_ops.RandomNormal'>

# run the graph...
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # OP's code
    print(sess.run(t4))  # [-0.39915761  2.10443926]
    print(sess.run(t5))  # [-0.39915761  2.10443926]

    # tf.random_normal is a Tensor
    print(sess.run(t6))  # [-0.39915761  2.10443926]

    # tf.random_normal_initializer returns a RandomNormal, not a Tensor or Op, so it can't be sess.run()!
    try:
        print(sess.run(t7))  # Exception!
    except Exception:
        print("Exception!")

    # But notice that you don't need to initialize an initializer, just a variable.
    t8 = tf.random_normal_initializer(seed=0)
    t9 = tf.get_variable('t9', shape=(2,), initializer=t8)
    sess.run(t9.initializer)  # still need to initialize the variable
    print(sess.run(t9))  # [-0.39915761  2.10443926]
In your setting: as far as the code you are calling goes, there is no real difference; the initializer keyword argument is overloaded to accept both and will behave as Maxim indicates. From the tensorflow/python/ops/variable_scope.py source:
if initializer is None:
  init, initializing_from_value = self._get_default_initializer(
      name=name, shape=shape, dtype=dtype)
  if initializing_from_value:
    init_shape = None
  else:
    init_shape = var_shape
elif callable(initializer):
  init = initializer
  init_shape = var_shape
elif isinstance(initializer, ops.Tensor):
  init = array_ops.slice(initializer, var_offset, var_shape)
  # Use the dtype of the given tensor.
  dtype = init.dtype.base_dtype
  init_shape = None
else:
  init = ops.convert_to_tensor(initializer, dtype=dtype)
  init = array_ops.slice(init, var_offset, var_shape)
  init_shape = None
tf.random_normal returns a tensor of the specified shape filled with random normal values. In addition, it creates a number of under-the-hood ops to compute the value:
random_normal/shape
random_normal/mean
random_normal/stddev
random_normal/RandomStandardNormal
random_normal/mul
At runtime, consecutive evaluations of this tensor produce a new value, but no other nodes are added to the graph.
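A small sketch of both claims (assuming TF 1.x graph mode): consecutive sess.run calls yield different samples, while the graph's operation count stays fixed.

import tensorflow as tf

tf.reset_default_graph()
t = tf.random_normal((2,))  # adds the ops listed above once, at graph-construction time
n_ops = len(tf.get_default_graph().get_operations())

with tf.Session() as sess:
    print(sess.run(t))  # one sample
    print(sess.run(t))  # a different sample (no seed was set)
    # evaluating the tensor again did not grow the graph
    assert len(tf.get_default_graph().get_operations()) == n_ops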
tf.random_normal_initializer is an Initializer instance, which invokes tf.random_normal upon calling. So there is no big difference between tf.random_normal_initializer and tf.random_normal. Even if you call the init twice, neither of them will add new nodes to the graph at runtime, but both add 6 additional nodes at graph-construction time.
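To see the "invokes upon calling" part directly, an Initializer instance can be called with a shape to get back an ordinary tensor (a minimal sketch, assuming the TF 1.x init_ops API):

import tensorflow as tf

tf.reset_default_graph()
init = tf.random_normal_initializer(seed=0)  # an Initializer instance, not a Tensor
t = init(shape=(2,))  # calling it builds a random_normal tensor in the graph
print(type(t))  # <class 'tensorflow.python.framework.ops.Tensor'>

with tf.Session() as sess:
    print(sess.run(t))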
Another alternative (that may be even more efficient in some cases) is initialization with a numpy.random.normal array, like this:
import numpy as np

t1 = tf.Variable(name='t1', initial_value=np.random.normal(size=(2,)))
This way no random_normal nodes are added to the graph, neither at graph-construction time nor at runtime.
UPD: TensorFlow adds the const op .../initial_value in this case, and the whole numpy array is going to be present in the graph, which may be a problem if the array is large.
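A common workaround when the initial array is large (a hedged sketch; the placeholder name is only illustrative) is to feed the value at initialization time instead of baking it into the graph as a const:

import numpy as np
import tensorflow as tf

tf.reset_default_graph()
big = np.random.normal(size=(1000, 1000)).astype(np.float32)

init_ph = tf.placeholder(tf.float32, shape=big.shape)  # value is fed, not stored in the graph
v = tf.Variable(init_ph, name='v')

with tf.Session() as sess:
    sess.run(v.initializer, feed_dict={init_ph: big})  # the array never becomes a graph const
    print(sess.run(tf.reduce_mean(v)))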