 

Reproducible results in Tensorflow with tf.set_random_seed


I am trying to generate N sets of independent random numbers. I have a simple piece of code that shows the problem for 3 sets of 10 random numbers. I notice that even though I use tf.set_random_seed to set the seed, the results of the different runs do not look alike. Any help or comments are greatly appreciated.

(py3p6) bash-3.2$ cat test.py

import tensorflow as tf

for i in range(3):
    tf.set_random_seed(1234)
    generate = tf.random_uniform((10,), 0, 10)
    with tf.Session() as sess:
        b = sess.run(generate)
        print(b)

This is the output of the code:

# output:
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128 7.9785547 8.296125  8.388672 ]
[8.559105  3.2390785 6.447526  8.316823  1.6297233 1.4103293 2.647568  2.954973  6.5975866 7.494894 ]
[2.0277488 6.6134906 0.7579422 4.6359386 6.97507   3.3192968 2.866236  2.2205782 6.7940736 7.2391043]

I want something like

[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128 7.9785547 8.296125  8.388672 ]
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128 7.9785547 8.296125  8.388672 ]
[9.604688  5.811516  6.4159    9.621765  0.5434954 4.1893444 5.8865128 7.9785547 8.296125  8.388672 ]

Update 1: The reason I put the seed initializer inside the for loop is that I want to set the seeds differently (think of it as different MCMC runs, for instance). The code below does the job, but I am not sure if it is efficient. Basically, I generate a few random seeds between 0 and 2^32 - 1 and change the seed in each run. Any help or comments on making it more memory/RAM efficient are greatly appreciated.

import numpy as np
import tensorflow as tf

global_seed = 42
N_chains = 5
np.random.seed(global_seed)
seeds = np.random.randint(0, 4294967295, size=N_chains)

for i in range(N_chains):
    tf.set_random_seed(seeds[i])
    # .... some stuff ....
    kernel_initializer = tf.random_normal_initializer(seed=seeds[i])
    # .... some stuff ....
    with tf.Session() as sess:
        # .... some stuff ....
        ...
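One memory-related point worth noting: each iteration above adds new ops to the same default graph, so the graph grows with every chain. A minimal sketch of one way to keep that bounded, assuming TF 1.x graph mode (the toy tf.random_uniform op here merely stands in for the real model):

import numpy as np
import tensorflow as tf

np.random.seed(42)
seeds = np.random.randint(0, 2**32 - 1, size=5)

for s in seeds:
    # dropping the previous chain's ops keeps the default graph,
    # and hence RAM usage, from growing with every iteration
    tf.reset_default_graph()
    tf.set_random_seed(int(s))           # per-chain graph-level seed
    generate = tf.random_uniform((10,))  # stand-in for the real model
    with tf.Session() as sess:
        print(sess.run(generate))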
asked Jul 09 '18 by Mehdi Rezaie


People also ask

What does tf.random.set_seed do?

This sets the global seed. Its interactions with operation-level seeds are as follows: if neither the global seed nor the operation seed is set, a randomly picked seed is used for the op.
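A small sketch of the complementary case, assuming TF 1.x graph mode: when only the global seed is set, the op seed is derived deterministically at graph construction, so separate sessions over the same graph reproduce the same value.

import tensorflow as tf

tf.set_random_seed(1234)        # global (graph-level) seed only
generate = tf.random_uniform(())

# the op seed was fixed when 'generate' was created, so every
# fresh session draws the same first value
with tf.Session() as s1:
    print(s1.run(generate))     # e.g. 0.96046877
with tf.Session() as s2:
    print(s2.run(generate))     # same value again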

What is random seed in machine learning?

A seed fixes the state of a random number generator, so that it produces the same random numbers across multiple executions of the code, on the same machine or on different machines (for a specific seed value). In a simple generator, each new value is derived from the previous one, so fixing the initial state fixes the entire sequence.
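A minimal illustration with NumPy (a sketch, not TensorFlow-specific):

import numpy as np

np.random.seed(0)
first = np.random.rand(3)

np.random.seed(0)               # re-seeding resets the generator state
second = np.random.rand(3)

print((first == second).all())  # True: same seed, same sequence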

What does seed mean in TensorFlow?

The term "seed" is an abbreviation of the standard term "random seed". TensorFlow operators that produce random results accept an optional seed parameter. If you pass the same number to two instances of the same operator, they will produce the same sequence of results.
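For instance, a sketch assuming TF 1.x, using the public op-level seed argument:

import tensorflow as tf

a = tf.random_uniform((), seed=42)
b = tf.random_uniform((), seed=42)  # same operator, same seed
with tf.Session() as sess:
    x, y = sess.run([a, b])
    print(x == y)                   # True: both streams start identically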


1 Answer

In TensorFlow, a random operation relies on two different seeds: a global seed, set by tf.set_random_seed, and an operation seed, provided as an argument to the operation. You will find more details on how they relate in the docs.

You get a different seed for each random op because each random op maintains its own internal state for pseudo-random number generation. The reason each random generator maintains its own state is robustness to change: if they shared the same state, then adding a new random generator somewhere in your graph would change the values produced by all the other generators, defeating the purpose of using a seed.

Now, why do we have this dual system of global and per-op seeds? Well, the global seed is not strictly necessary. It is there for convenience: it allows you to set all random op seeds to different, deterministic (if unknown) values at once, without having to go through all of them exhaustively.

Now when a global seed is set but not the op seed, according to the docs,

The system deterministically picks an operation seed in conjunction with the graph-level seed so that it gets a unique random sequence.

To be more precise, the seed that is provided is the id of the last operation created in the current graph. Consequently, globally seeded random operations are extremely sensitive to changes in the graph, in particular to operations created before them.

For example,

import tensorflow as tf

tf.set_random_seed(1234)
generate = tf.random_uniform(())
with tf.Session() as sess:
    print(generate.eval())
    # 0.96046877

Now if we create a node before, the result changes:

import tensorflow as tf

tf.set_random_seed(1234)
tf.zeros(())  # new op added before
generate = tf.random_uniform(())
with tf.Session() as sess:
    print(generate.eval())
    # 0.29252338

If a node is created after, however, it does not affect the op seed:

import tensorflow as tf

tf.set_random_seed(1234)
generate = tf.random_uniform(())
tf.zeros(())  # new op added after
with tf.Session() as sess:
    print(generate.eval())
    # 0.96046877

Obviously, as in your case, if you generate several operations, they will have different seeds:

import tensorflow as tf

tf.set_random_seed(1234)
gen1 = tf.random_uniform(())
gen2 = tf.random_uniform(())
with tf.Session() as sess:
    print(gen1.eval())
    print(gen2.eval())
    # 0.96046877
    # 0.85591054

As a curiosity, and to validate the fact that the derived seed is simply the id of the last op in the graph, you could align the seed of gen2 with that of gen1:

import tensorflow as tf

tf.set_random_seed(1234)
gen1 = tf.random_uniform(())
# 4 operations seem to be created after the seed has been picked
seed = tf.get_default_graph()._last_id - 4
gen2 = tf.random_uniform((), seed=seed)
with tf.Session() as sess:
    print(gen1.eval())
    print(gen2.eval())
    # 0.96046877
    # 0.96046877

Obviously though, this should not pass code review.
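For the use case in the question, a more supportable route (a sketch, assuming TF 1.x) is to pass the operation-level seed explicitly. Since two instances of the same op with the same seed produce the same sequence, every iteration then prints the identical array the question asked for:

import tensorflow as tf

for i in range(3):
    # the explicit op-level seed pins this op's stream, regardless of
    # how many ops were created in the default graph before it
    generate = tf.random_uniform((10,), 0, 10, seed=1234)
    with tf.Session() as sess:
        print(sess.run(generate))
# all three printed arrays are identical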

answered Dec 28 '22 by P-Gn