Difference between OpenAI Gym environments 'CartPole-v0' and 'CartPole-v1'

I can't find an exact description of the differences between the OpenAI Gym environments 'CartPole-v0' and 'CartPole-v1'.

Both environments have separate official websites dedicated to them (see 1 and 2), but I can only find one source file, without any version identification, in the gym GitHub repository (see 3). I also checked in the debugger which files are actually loaded, and both environments seem to load that same file. The only difference seems to be in their internally assigned max_episode_steps and reward_threshold, which can be accessed as shown below. CartPole-v0 has the values 200/195.0 and CartPole-v1 has the values 500/475.0. The rest seems identical at first glance.

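import gym

env = gym.make("CartPole-v1")
# Both values are stored on the environment's spec.
print(env.spec.max_episode_steps)  # 500 for CartPole-v1, 200 for CartPole-v0
print(env.spec.reward_threshold)   # 475.0 for CartPole-v1, 195.0 for CartPole-v0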

I would therefore appreciate it if someone could describe the exact differences for me or forward me to a website that is doing so. Thank you very much!

asked Jul 05 '19 13:07 by PaulOnStackoverflow


1 Answer

As you have probably noticed, OpenAI Gym sometimes offers several versions of the same environment. The versions usually share the main environment logic, but some parameters are configured with different values. These versions are managed through a feature called the registry.
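To make the registry idea concrete, here is a minimal toy sketch in plain Python. It is not Gym's actual implementation: ToyCartPoleEnv, ToySpec, and the register/spec helpers are hypothetical names invented for illustration. It only shows how one class can be registered under two IDs that differ in their parameters, using the values from the question.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


class ToyCartPoleEnv:
    """Hypothetical stand-in for the shared environment class (no version-specific logic)."""


@dataclass(frozen=True)
class ToySpec:
    """Hypothetical spec record: one entry per registered ID."""
    id: str
    entry_point: Callable[[], object]
    max_episode_steps: Optional[int] = None
    reward_threshold: Optional[float] = None


# Toy registry: maps an environment ID to its spec.
_registry: Dict[str, ToySpec] = {}


def register(id, entry_point, max_episode_steps=None, reward_threshold=None):
    """Store a spec under the given ID."""
    _registry[id] = ToySpec(id, entry_point, max_episode_steps, reward_threshold)


def spec(id):
    """Look up the spec registered under the given ID."""
    return _registry[id]


# Same class, two IDs, different parameters (values taken from the question).
register("CartPole-v0", ToyCartPoleEnv, max_episode_steps=200, reward_threshold=195.0)
register("CartPole-v1", ToyCartPoleEnv, max_episode_steps=500, reward_threshold=475.0)

v0, v1 = spec("CartPole-v0"), spec("CartPole-v1")
print(v0.entry_point is v1.entry_point)           # same underlying class
print(v0.max_episode_steps, v0.reward_threshold)
print(v1.max_episode_steps, v1.reward_threshold)
```

The point of the sketch is that the version suffix selects a configuration, not a different piece of environment code.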

In the case of the CartPole environment, you can find the two registered versions in this source code. As you can see in lines 50 to 65, there are two CartPole versions, tagged v0 and v1, which differ only in the parameters max_episode_steps and reward_threshold:

register(
    id='CartPole-v0',
    entry_point='gym.envs.classic_control:CartPoleEnv',
    max_episode_steps=200,
    reward_threshold=195.0,
)

register(
    id='CartPole-v1',
    entry_point='gym.envs.classic_control:CartPoleEnv',
    max_episode_steps=500,
    reward_threshold=475.0,
)

Both registrations use the same entry point, so these two parameters are the only difference between CartPole-v0 and CartPole-v1, which confirms your guess.
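In practice the two parameters matter as follows: CartPole gives a reward of +1 per step, so max_episode_steps caps the return of a single episode, and reward_threshold is the return at which the task is conventionally regarded as solved. Below is a rough sketch of such a check. The is_solved helper is hypothetical, and the 100-episode averaging window is an assumption based on the usual Gym convention; the environment itself does not enforce it.

```python
from statistics import mean

# Values copied from the two register() calls shown above.
SPECS = {
    "CartPole-v0": {"max_episode_steps": 200, "reward_threshold": 195.0},
    "CartPole-v1": {"max_episode_steps": 500, "reward_threshold": 475.0},
}


def is_solved(env_id, episode_returns, window=100):
    """Hypothetical helper: True if the mean of the last `window` returns
    reaches the reward_threshold registered for `env_id`."""
    if len(episode_returns) < window:
        return False  # not enough episodes to judge yet
    return mean(episode_returns[-window:]) >= SPECS[env_id]["reward_threshold"]


# An agent that balances the pole for exactly 200 steps in every episode.
returns = [200.0] * 100
print(is_solved("CartPole-v0", returns))  # True: 200 >= 195
print(is_solved("CartPole-v1", returns))  # False: 200 < 475
```

So an agent that is perfect by v0 standards can still fall well short on v1, simply because v1 episodes are allowed to run longer.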

answered Oct 18 '22 09:10 by Pablo EM