I want to know the specification of the observations in CartPole-v0 in OpenAI Gym (https://gym.openai.com/).
For example, the following code prints observations. One observation looks like [-0.061586 -0.75893141 0.05793238 1.15547541].
I want to know what these numbers mean, and I would also like a way to find the specifications of other environments such as MountainCar-v0, MsPacman-v0, and so on.
I tried reading https://github.com/openai/gym, but I couldn't find the answer there. Could you tell me how to find these specifications?
import gym

env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()
    for t in range(100):
        env.render()
        print(observation)
        action = env.action_space.sample()  # pick a random action
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break
(from https://gym.openai.com/docs)
The output is the following:
[-0.061586 -0.75893141 0.05793238 1.15547541]
[-0.07676463 -0.95475889 0.08104189 1.46574644]
[-0.0958598 -1.15077434 0.11035682 1.78260485]
[-0.11887529 -0.95705275 0.14600892 1.5261692 ]
[-0.13801635 -0.7639636 0.1765323 1.28239155]
[-0.15329562 -0.57147373 0.20218013 1.04977545]
Episode finished after 14 timesteps
[-0.02786724 0.00361763 -0.03938967 -0.01611184]
[-0.02779488 -0.19091794 -0.03971191 0.26388759]
[-0.03161324 0.00474768 -0.03443415 -0.04105167]
The observation space used in OpenAI Gym is not exactly the same as in the original paper. Look at OpenAI's wiki to find the answer. The observation space is 4-dimensional, and each dimension is as follows:
Num  Observation            Min       Max
 0   Cart Position          -2.4      2.4
 1   Cart Velocity          -Inf      Inf
 2   Pole Angle             ~ -41.8°  ~ 41.8°
 3   Pole Velocity At Tip   -Inf      Inf
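You can also query these bounds programmatically instead of looking them up. A minimal sketch using the same classic Gym API as the code above (the comments show what I would expect, not guaranteed output):

import gym

env = gym.make('CartPole-v0')
print(env.observation_space)       # a Box(4,) -- 4-dimensional continuous space
print(env.observation_space.high)  # per-dimension upper bounds
print(env.observation_space.low)   # per-dimension lower bounds
print(env.action_space)            # Discrete(2): 0 = push cart left, 1 = push cart right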
After the paragraph describing each environment on the OpenAI Gym website, there is always a reference that explains the environment in detail. For example, for CartPole-v0 you can find all the details in:
[Barto83] A. G. Barto, R. S. Sutton and C. W. Anderson, "Neuronlike adaptive elements that can solve difficult learning control problems", IEEE Transactions on Systems, Man, and Cybernetics, 1983.
In that paper you can read that the cart-pole has four state variables: the cart position, the cart velocity, the pole angle, and the pole velocity at the tip. So the observation is simply a vector with the values of those four state variables.
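So, if you want named variables, you can just unpack the observation vector. A small sketch; the variable names are only illustrative, following the table above:

import gym

env = gym.make('CartPole-v0')
observation = env.reset()

# Unpack the 4-D observation into the four state variables from the table above.
# Note: the pole angle is stored in radians in Gym's implementation.
cart_position, cart_velocity, pole_angle, pole_velocity_at_tip = observation
print("cart position:", cart_position)
print("cart velocity:", cart_velocity)
print("pole angle:", pole_angle)
print("pole velocity at tip:", pole_velocity_at_tip)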
Similarly, the details of MountainCar-v0 can be found in:
[Moore90] A. Moore, Efficient Memory-Based Learning for Robot Control, PhD thesis, University of Cambridge, 1990.
and so on.
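Beyond the papers, the same introspection works for any registered environment, so a quick way to survey several specifications is a loop like the following sketch (MsPacman-v0 additionally requires the Atari dependencies, e.g. pip install gym[atari]):

import gym

for env_id in ['CartPole-v0', 'MountainCar-v0', 'MsPacman-v0']:
    env = gym.make(env_id)
    print(env_id)
    print("  observation space:", env.observation_space)
    print("  action space:", env.action_space)
    # Atari environments also expose human-readable action names:
    if hasattr(env.unwrapped, 'get_action_meanings'):
        print("  action meanings:", env.unwrapped.get_action_meanings())
    env.close()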