I can't find an exact description of the differences between the OpenAI Gym environments 'CartPole-v0' and 'CartPole-v1'.
Both environments have separate official pages dedicated to them (see 1 and 2), though I can only find one implementation, without any version identification, in the gym GitHub repository (see 3). I also checked in the debugger which files are actually loaded, and both seem to load that same file. The only difference seems to be in their internally assigned max_episode_steps and reward_threshold, which can be accessed as shown below. CartPole-v0 has the values 200/195.0 and CartPole-v1 has the values 500/475.0. The rest seems identical at first glance.
import gym

env = gym.make("CartPole-v1")
print(env.spec.max_episode_steps)  # 500 for v1, 200 for v0
print(env.spec.reward_threshold)   # 475.0 for v1, 195.0 for v0
I would therefore appreciate it if someone could describe the exact differences for me or forward me to a website that is doing so. Thank you very much!
CartPole-v0 is a simple playground provided by OpenAI to train and test Reinforcement Learning algorithms. The agent is the cart, controlled by two possible actions (+1 and -1) that move it left or right. A reward of +1 is given at every timestep in which the pole remains upright.
By Ayoosh Kathuria. OpenAI Gym comes packed with a lot of awesome environments, ranging from environments featuring classic control tasks to ones that let you train your agents to play Atari games like Breakout, Pacman, and Seaquest.
The observation_space defines the structure as well as the legitimate values for the observation of the state of the environment. The observation can be different things for different environments. The most common form is a screenshot of the game.
As you probably have noticed, in OpenAI Gym sometimes there are different versions of the same environments. The different versions usually share the main environment logic but some parameters are configured with different values. These versions are managed using a feature called the registry.
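To make the registry idea concrete, here is a minimal sketch of how several versions can share one implementation while differing only in their parameters. It is not Gym's actual code: `ToySpec`, `TOY_REGISTRY` and `toy_register` are hypothetical names made up for illustration, and only the field names and values are taken from the CartPole registrations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySpec:
    """Hypothetical stand-in for an environment spec (not Gym's own class)."""
    env_id: str
    entry_point: str          # where the shared environment logic lives
    max_episode_steps: int    # episode length cap for this version
    reward_threshold: float   # score at which the task counts as solved


# Hypothetical registry: maps a versioned id to its spec.
TOY_REGISTRY = {}


def toy_register(env_id, entry_point, max_episode_steps, reward_threshold):
    """Store one versioned configuration under its id."""
    if env_id in TOY_REGISTRY:
        raise ValueError(f"{env_id} is already registered")
    TOY_REGISTRY[env_id] = ToySpec(
        env_id, entry_point, max_episode_steps, reward_threshold
    )


# Two versions, one entry point, different parameters.
toy_register("CartPole-v0", "gym.envs.classic_control:CartPoleEnv", 200, 195.0)
toy_register("CartPole-v1", "gym.envs.classic_control:CartPoleEnv", 500, 475.0)

v0 = TOY_REGISTRY["CartPole-v0"]
v1 = TOY_REGISTRY["CartPole-v1"]
same_logic = v0.entry_point == v1.entry_point
print(same_logic, v0.max_episode_steps, v1.max_episode_steps)
```

The point of the design is that the version suffix selects a configuration, not a different class, which is why a debugger shows the same file being loaded for both ids.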
In the case of the CartPole environment, you can find the two registered versions in this source code. As you can see in lines 50 to 65, there exist two CartPole versions, tagged as v0 and v1, whose differences are the parameters max_episode_steps and reward_threshold:
register(
    id='CartPole-v0',
    entry_point='gym.envs.classic_control:CartPoleEnv',
    max_episode_steps=200,
    reward_threshold=195.0,
)
register(
    id='CartPole-v1',
    entry_point='gym.envs.classic_control:CartPoleEnv',
    max_episode_steps=500,
    reward_threshold=475.0,
)
Both parameters confirm your guess about the difference between CartPole-v0 and CartPole-v1.
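As a quick check of that conclusion, the two registrations can be compared field by field. The sketch below copies the values from the register calls into plain dictionaries, and `spec_differences` is a hypothetical helper written for this example, not a Gym function. It also works out what each threshold means: since CartPole gives a reward of +1 per timestep, an episode's return cannot exceed max_episode_steps.

```python
# Values copied by hand from the two register(...) calls.
V0 = {
    "entry_point": "gym.envs.classic_control:CartPoleEnv",
    "max_episode_steps": 200,
    "reward_threshold": 195.0,
}
V1 = {
    "entry_point": "gym.envs.classic_control:CartPoleEnv",
    "max_episode_steps": 500,
    "reward_threshold": 475.0,
}


def spec_differences(a, b):
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(a.keys() | b.keys())
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


diff = spec_differences(V0, V1)
for key, (old, new) in diff.items():
    print(f"{key}: {old} -> {new}")

# With +1 reward per step, the best possible return equals the step cap,
# so the threshold can be read as a fraction of a perfect episode.
ratio_v0 = V0["reward_threshold"] / V0["max_episode_steps"]
ratio_v1 = V1["reward_threshold"] / V1["max_episode_steps"]
print(ratio_v0, ratio_v1)
```

In practice, v1 asks for longer episodes (up to 500 steps instead of 200) and a correspondingly higher score, while the environment dynamics stay the same. The same comparison could be fed with values read from `env.spec`, as in the question.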