
How should OpenAI environments (gyms) use env.seed(0)?

I've created a very simple OpenAI gym (banana-gym) and wonder if / how I should implement env.seed(0).

See https://github.com/openai/gym/issues/250#issuecomment-234126816 for example.

Martin Thoma asked Nov 16 '17 13:11



2 Answers

In a recent merge, the developers of OpenAI gym changed the behavior of env.seed() so that it no longer calls env._seed(). Instead, the method now just issues a warning and returns. I think that if you want to use this method to set the seed of your environment, you should now override env.seed() itself.

Gregor answered Oct 03 '22 08:10



The docstring of the env.seed() function (which can be found in this file) describes what an implementation is expected to do:

Sets the seed for this env's random number generator(s).

    Note:
        Some environments use multiple pseudorandom number generators.
        We want to capture all such seeds used in order to ensure that
        there aren't accidental correlations between multiple generators.
    Returns:
        list<bigint>: Returns the list of seeds used in this env's random
          number generators. The first value in the list should be the
          "main" seed, or the value which a reproducer should pass to
          'seed'. Often, the main seed equals the provided 'seed', but
          this won't be true if seed=None, for example.

Note that, contrary to what the documentation and the comments in the issue you linked to seem to imply, it doesn't seem (to me) like env.seed() is supposed to be overridden by custom environments. env.seed() has a very simple implementation: it only calls env._seed() and returns its return value. It seems to me that env._seed() is the function which custom environments should override.

For example, OpenAI gym's atari environments have a custom _seed() implementation which sets the seed used internally by the (C++-based) Arcade Learning Environment.

Since you have a random.random() call in your custom environment, you should probably implement _seed() to call random.seed(). That way, users of your environment can reproduce experiments by calling seed() on it with the same argument.
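As a rough sketch of that pattern: the snippet below does not import gym, so `BaseEnv` is only a hypothetical stand-in for the seed()/_seed() delegation described above, and `GlobalSeedEnv` and `sample()` are made-up names for illustration. In a real environment you would subclass gym.Env instead.

```python
import random


class BaseEnv:
    """Hypothetical stand-in for the seed()/_seed() split described above."""

    def seed(self, seed=None):
        # Public entry point: just delegate to the overridable hook.
        return self._seed(seed)

    def _seed(self, seed=None):
        # Default implementation: nothing to seed.
        return []


class GlobalSeedEnv(BaseEnv):
    """Hypothetical environment that draws from the global `random` module."""

    def _seed(self, seed=None):
        # Seeds the *global* generator, which affects every user of `random`.
        random.seed(seed)
        return [seed]

    def sample(self):
        return random.random()


env = GlobalSeedEnv()
env.seed(0)
first_run = [env.sample() for _ in range(3)]
env.seed(0)
second_run = [env.sample() for _ in range(3)]
print(first_run == second_run)
```

Seeding with the same value before each run makes the two sequences identical, which is what a user reproducing an experiment would rely on.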

Note: Changing the global random seed like this may surprise users of your environment. It may be better to create a dedicated random object when your environment is initialized, seed that object, and always draw the random numbers your environment needs from it.
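A minimal sketch of the dedicated-generator approach, again without importing gym: `LocalRngEnv` and `sample()` are hypothetical names, and here seed() is overridden directly. Picking a fresh seed when `seed=None` is one possible reading of the docstring's remark that the main seed may differ from the argument in that case.

```python
import random


class LocalRngEnv:
    """Hypothetical environment that owns its random number generator."""

    def __init__(self):
        # Private generator, independent of the global `random` state.
        self._rng = random.Random()

    def seed(self, seed=None):
        if seed is None:
            # Assumption: choose a seed ourselves so it can be reported back.
            seed = random.SystemRandom().randrange(2**32)
        self._rng.seed(seed)
        # Return the list of seeds used, main seed first.
        return [seed]

    def sample(self):
        return self._rng.random()


env_a, env_b = LocalRngEnv(), LocalRngEnv()
env_a.seed(0)
env_b.seed(0)
random.seed(123)  # touching the global generator does not affect either env
samples_a = [env_a.sample() for _ in range(3)]
random.random()   # neither does drawing from it
samples_b = [env_b.sample() for _ in range(3)]
print(samples_a == samples_b)
```

Because each environment draws only from its own generator, other code that seeds or uses the global `random` module cannot break reproducibility, and seeding the environment does not disturb that other code either.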

Dennis Soemers answered Oct 03 '22 07:10
