
Why do I need to initialize variables in TensorFlow?

I primarily develop my models in R and am currently learning TensorFlow. I'm going through a tutorial with the following code:

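import tensorflow as tf  # TensorFlow 1.x graph-mode API

# Assumption: the tutorial opens a session before this snippet; without one,
# .run() and .eval() have no default session to use.
sess = tf.InteractiveSession()

raw_data = [1., 2., 8., -1., 0., 5.5, 6., 13]
spike = tf.Variable(False)
spike.initializer.run()

for i in range(1, len(raw_data)):
    # Flag a spike when consecutive values jump by more than 5.
    if raw_data[i] - raw_data[i-1] > 5:
        updater = tf.assign(spike, True)
        updater.eval()
    else:
        tf.assign(spike, False).eval()
    print("Spike", spike.eval())
sess.close()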

From a layman's perspective, why do I need to declare values as Variables and then initialize them in TensorFlow? I know this may be a basic question, but it's something R doesn't require.

Asked by DataTx on Jan 01 '18 at 22:01


1 Answer

Let's have a look at what the script actually does:

spike = tf.Variable(False)

This line creates a symbolic variable, i.e. a node in the computational graph, with a constant initializer. At this point nothing has been allocated for the variable, and it isn't even known yet which device (CPU or GPU) it will be placed on.

Next,

spike.initializer.run()

This line runs the initializer of spike in the default session, which you've already started.
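To make the split between defining a variable and initializing it concrete, here is a toy sketch in plain Python. It is only an analogy, not how TensorFlow is implemented: the `LazyVariable` class and its methods are hypothetical names invented for this illustration.

```python
class LazyVariable:
    """Toy stand-in for a graph variable (hypothetical, not a TensorFlow class).

    It records how to produce its initial value but holds no value
    until initialize() is called.
    """

    def __init__(self, initial_value_fn):
        self._initial_value_fn = initial_value_fn  # recipe only, nothing computed yet
        self._value = None
        self.initialized = False

    def initialize(self):
        # Analogous to running the initializer op inside a session.
        self._value = self._initial_value_fn()
        self.initialized = True

    def read(self):
        if not self.initialized:
            raise RuntimeError("variable used before initialization")
        return self._value


# Definition: only the recipe for the initial value is stored.
spike = LazyVariable(lambda: False)

# Reading before initialization fails, because no value exists yet.
try:
    spike.read()
    read_before_init_failed = False
except RuntimeError:
    read_before_init_failed = True

# Initialization: the stored recipe is executed and the value becomes available.
spike.initialize()
value_after_init = spike.read()
print(read_before_init_failed, value_after_init)
```

The point of the sketch is that the definition step is cheap and side-effect free, while the value only comes into existence in a separate, explicit step.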

Note first that, although the code is perfectly valid, it's not typical of a real application. More commonly there's a separation of responsibilities: the model is defined in one or more source files and executed from another file or files. Initialization logically belongs to the latter, because memory is allocated only when the session starts.

Secondly, a constant is not the only way to initialize a variable. For example, the Xavier initializer needs to know the structure around the variable in order to compute the number of incoming and outgoing connections and deduce the standard deviation from them. It simply wouldn't work if we tried to initialize the variable at definition time.
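As a rough illustration of what such an initializer has to compute, the sketch below evaluates the commonly cited Xavier (Glorot) scale factors from the incoming and outgoing connection counts. The function names and the layer sizes 256 and 128 are assumptions made up for this example; this is not TensorFlow's own code.

```python
import math


def glorot_normal_stddev(fan_in, fan_out):
    """Standard deviation for the normal variant: sqrt(2 / (fan_in + fan_out))."""
    return math.sqrt(2.0 / (fan_in + fan_out))


def glorot_uniform_limit(fan_in, fan_out):
    """Half-width of the uniform variant's range: sqrt(6 / (fan_in + fan_out))."""
    return math.sqrt(6.0 / (fan_in + fan_out))


# Hypothetical dense layer with 256 inputs and 128 outputs.
fan_in, fan_out = 256, 128
stddev = glorot_normal_stddev(fan_in, fan_out)
limit = glorot_uniform_limit(fan_in, fan_out)
print(stddev, limit)
```

Neither number can be written down until both connection counts are known, which is why this kind of initializer is computed rather than given as a literal value.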

I hope the TensorFlow design is clearer now: the initializer is a dedicated op. For your use case specifically, TensorFlow has released eager mode, an imperative, define-by-run interface in which operations execute immediately as they are called from Python.

You can start it like this:

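# contrib API from TensorFlow 1.5/1.6; 1.7+ exposes tf.enable_eager_execution(), and 2.x is eager by default
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()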

... and it will spare you boilerplate like the above.
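For comparison, here is the spike-detection logic from the question written as plain Python with no TensorFlow involved. It is only meant to show the imperative, run-as-you-go style that eager execution approximates; `detect_spikes` and its `threshold` parameter are hypothetical names chosen for this sketch.

```python
def detect_spikes(values, threshold=5):
    """Return one flag per consecutive pair: True when the jump exceeds threshold."""
    flags = []
    for previous, current in zip(values, values[1:]):
        spike = (current - previous) > threshold  # value is available immediately
        flags.append(spike)
    return flags


raw_data = [1., 2., 8., -1., 0., 5.5, 6., 13]
flags = detect_spikes(raw_data)
for flag in flags:
    print("Spike", flag)
```

Here every assignment takes effect as soon as the line runs, so there is no separate initialization step and no session to open or close.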

Answered by Maxim on Oct 12 '22 at 03:10