I'm running a reinforcement learning program in a gym environment (BipedalWalker-v2) implemented in TensorFlow. I've manually set the random seeds of the environment, TensorFlow, and NumPy as follows:
import os
import random
import gym
import numpy as np
import tensorflow as tf

os.environ['PYTHONHASHSEED'] = str(42)
random.seed(42)
np.random.seed(42)
tf.set_random_seed(42)
env = gym.make('BipedalWalker-v2')
env.seed(0)
config = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
# run the graph with sess
However, I get different results every time I run the program, without changing any code. Why are the results not consistent, and what should I do to obtain the same result every time?
The only places I can think of that may introduce randomness (other than the neural networks) are:

1. tf.truncated_normal, used to generate the random noise epsilon for the noisy layer;
2. np.random.uniform, used to randomly select samples from the replay buffer (a sketch of how both sources could be seeded explicitly is shown below).

I also notice that the scores I get are fairly consistent over the first 10 episodes but then begin to diverge. Other quantities, such as the losses, show a similar trend but are not numerically identical.
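For reference, here is a minimal sketch of how these two sources could be given explicit seeds; the noisy-layer and replay-buffer code isn't shown in the question, so the shapes and helper names below are hypothetical:

import numpy as np
import tensorflow as tf

# Op-level seed for the noise epsilon used by the noisy layer (TF 1.x API).
# With tf.set_random_seed already called this should not be strictly necessary,
# but an explicit seed removes one variable when hunting for nondeterminism.
epsilon = tf.truncated_normal(shape=[128], stddev=1.0, seed=42)  # shape is hypothetical

# A dedicated, seeded RandomState for replay-buffer sampling, so that other
# np.random calls elsewhere in the program cannot disturb the sampling sequence.
buffer_rng = np.random.RandomState(42)

def sample_indices(buffer_size, batch_size):
    # Hypothetical helper replacing np.random.uniform-based index selection.
    return buffer_rng.randint(0, buffer_size, size=batch_size)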
I've also set "PYTHONHASHSEED" and use single-thread CPU as @jaypops96 described, but still cannot reproduce the result. Code has been updated in the above code block
A seed fixes the initial state of a random number generator, so that it produces the same sequence of random numbers across multiple executions of the code, on the same machine or on different machines, for a given seed value.
Operations that rely on a random seed actually derive it from two seeds: the graph-level (global) seed and the operation-level seed. tf.set_random_seed sets the graph-level seed. Its interaction with operation-level seeds is as follows: if neither the graph-level seed nor the operation seed is set, a randomly picked seed is used for the op; if only the graph-level seed is set, the operation seed is derived from it deterministically.
A random seed (or seed state, or just seed) is a number (or vector) used to initialize a pseudorandom number generator. The seed itself does not need to be random; what matters is that reusing the same seed reproduces the same pseudorandom sequence.
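A minimal sketch of the graph-level / operation-level interaction described above, using the TF 1.x API (illustrative only):

import tensorflow as tf

tf.set_random_seed(42)              # graph-level seed
a = tf.random_uniform([1])          # op seed derived deterministically from the graph-level seed
b = tf.random_uniform([1], seed=7)  # combines the graph-level and operation-level seeds

with tf.Session() as sess:
    print(sess.run([a, b]))         # the same values on every fresh run of the script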
I suggest checking whether your TensorFlow graph contains nondeterministic operations. For example, reduce_sum was one such operation before TensorFlow 1.2. Such operations are nondeterministic because floating-point addition and multiplication are non-associative (the order in which floating-point numbers are added or multiplied affects the result) and because these operations don't guarantee that their inputs are added or multiplied in the same order every time. See also this question.
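A quick way to see the non-associativity at the root of this:

# Plain Python floats (IEEE 754 doubles): grouping changes the result.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
print((0.1 + 0.2) + 0.3, 0.1 + (0.2 + 0.3))    # 0.6000000000000001 0.6

A parallel reduction that splits the sum differently between runs can therefore produce slightly different totals from identical inputs.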
EDIT (Sep. 20, 2020): The GitHub repository framework-determinism has more information about sources of nondeterminism in machine learning frameworks, particularly TensorFlow.
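For completeness: if you can move to a TensorFlow version covered by that project (stock TF 2.1+, or NVIDIA's patched 1.14/1.15 containers), its documentation describes requesting deterministic op implementations via an environment variable; this does not apply to a plain TF 1.x install like the one in the question:

import os
os.environ['TF_DETERMINISTIC_OPS'] = '1'  # honoured by TF 2.1+, per the framework-determinism docs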