When using the MountainCar-v0 environment from OpenAI Gym in Python, the value done becomes true after 200 time steps. Why is that? The goal state hasn't been reached, so the episode shouldn't be done.
import gym

env = gym.make('MountainCar-v0')
env.reset()
for t in range(300):
    env.render()
    res = env.step(env.action_space.sample())
    print(t)
    print(res[2])  # done
I want to run the step method until the car reaches the flag and then break the for loop. Is this possible? Something similar to this:
n_episodes = 10
for i in range(n_episodes):
    env.reset()
    done = False  # reset the flag at the start of each episode
    while not done:
        env.render()
        state, reward, done, _ = env.step(env.action_space.sample())
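The corrected episode loop can be checked without gym at all by using a stub with the same reset/step interface (the StubEnv class below is a stand-in for illustration, not part of gym):

```python
import random

class StubEnv:
    """Minimal stand-in with gym's reset/step interface (not a real gym env)."""
    def reset(self):
        self.steps = 0
        return 0.0  # dummy initial state

    def step(self, action):
        self.steps += 1
        # Episode ends on its own after a random number of steps.
        done = self.steps >= random.randint(5, 15)
        return 0.0, -1.0, done, {}  # state, reward, done, info

env = StubEnv()
n_episodes = 10
for i in range(n_episodes):
    state = env.reset()
    done = False            # must be reset for every episode
    while not done:
        state, reward, done, _ = env.step(None)
    print(f"episode {i} finished after {env.steps} steps")
```

The key point is that done is reset to False inside the for loop; if it is only set once before the loop, every episode after the first is skipped.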
The current version of gym force-stops the environment after 200 steps even if you don't use env.monitor.
To avoid this, use
env = gym.make("MountainCar-v0").env
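The 200-step cutoff comes from the TimeLimit wrapper that gym.make puts around the raw environment; .env gives you the unwrapped environment underneath it. A minimal sketch of what that wrapper does (simplified for illustration, not gym's actual code; NeverDoneEnv is a dummy stand-in):

```python
class TimeLimitSketch:
    """Simplified version of gym's TimeLimit wrapper behaviour."""
    def __init__(self, env, max_episode_steps=200):
        self.env = env
        self.max_episode_steps = max_episode_steps
        self._elapsed = 0

    def reset(self):
        self._elapsed = 0
        return self.env.reset()

    def step(self, action):
        state, reward, done, info = self.env.step(action)
        self._elapsed += 1
        if self._elapsed >= self.max_episode_steps:
            done = True  # force the episode to end, goal reached or not
            info['TimeLimit.truncated'] = True
        return state, reward, done, info

class NeverDoneEnv:
    """Dummy inner env whose episodes never end on their own."""
    def reset(self):
        return 0.0
    def step(self, action):
        return 0.0, -1.0, False, {}

env = TimeLimitSketch(NeverDoneEnv(), max_episode_steps=200)
env.reset()
for t in range(300):
    _, _, done, info = env.step(0)
    if done:
        break
```

Here done becomes True on the 200th call to step even though the inner environment never signals the end of an episode, which is exactly why MountainCar-v0 reports done after 200 steps.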
Copied from https://github.com/openai/gym/wiki/FAQ:
Environments are intended to have various levels of difficulty, in order to benchmark the ability of reinforcement learning agents to solve them. Many of the environments are beyond the current state of the art, so don't expect to solve all of them. (If you do, please apply).
If you want to experiment with a variant of an environment that behaves differently, you should give it a new name so that you won't erroneously compare your agent running on an easy variant to someone else's agent running on the original environment. For instance, the MountainCar environment is hard partly because there's a limit of 200 timesteps after which it resets to the beginning. Successful agents must solve it in less than 200 timesteps. For testing purposes, you could make a new environment MountainCarMyEasyVersion-v0 with different parameters by adapting one of the calls to register found in gym/gym/envs/__init__.py:
gym.envs.register(
    id='MountainCarMyEasyVersion-v0',
    entry_point='gym.envs.classic_control:MountainCarEnv',
    max_episode_steps=250,      # MountainCar-v0 uses 200
    reward_threshold=-110.0,
)
env = gym.make('MountainCarMyEasyVersion-v0')
Because these environment names are only known to your code, you won't be able to upload it to the scoreboard.