
Why unwrap an OpenAI Gym environment?

I'm trying to get some insight into reinforcement learning, using OpenAI Gym as a learning environment. I'm working through the book Hands-on reinforcement learning with Python, which provides some code. Often the code doesn't work as printed, because I have to unwrap the environment first, as shown in the question "openai gym env.P, AttributeError 'TimeLimit' object has no attribute 'P'".

However, I personally am still interested in the WHY of this unwrapping. Why do you need to unwrap? What does this do exactly? And why isn't it coded like that in the book? Is it outdated software as Giuliov assumed?

Thanks in advance.

Bram Janssens asked Dec 18 '18 15:12





1 Answer

OpenAI Gym offers many different environments, each with its own set of parameters and methods. Nevertheless, they are generally wrapped by a single class called Env (comparable to an interface in an object-oriented language). This class exposes the most essential methods common to any environment, like step, reset and seed. Having this "interface" class is great, because it allows your code to be environment agnostic. It also makes things easier if you want to test a single agent on different environments.
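To illustrate the "environment agnostic" point, here is a minimal sketch. It does not import Gym: `CoinFlipEnv`, `CountdownEnv` and `run_episode` are hypothetical names made up for this example, and the `step` return value `(observation, reward, done, info)` is assumed to follow the Gym convention from around the time of the question.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: guess a coin flip, 5 steps per episode."""

    def __init__(self):
        self.rng = random.Random(0)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # dummy observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == self.rng.randint(0, 1) else 0.0
        return 0, reward, self.t >= 5, {}


class CountdownEnv:
    """Hypothetical toy environment: the observation counts down from 3 to 0."""

    def reset(self):
        self.n = 3
        return self.n

    def step(self, action):
        self.n -= 1
        return self.n, -1.0, self.n == 0, {}


def run_episode(env, policy):
    """Runs one episode on any object that offers reset() and step(action)."""
    obs = env.reset()
    total, steps, done = 0.0, 0, False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total += reward
        steps += 1
    return total, steps


# The same loop drives both environments without knowing their internals.
countdown_result = run_episode(CountdownEnv(), lambda obs: 0)
coinflip_result = run_episode(CoinFlipEnv(), lambda obs: 1)
```

The agent loop only touches `reset` and `step`, so swapping the environment requires no change to `run_episode`.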

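However, if you want to access the behind-the-scenes dynamics of a specific environment, that is, attributes that are not part of this common interface (such as P in the error you quote), then you use the unwrapped property, which returns the underlying environment itself.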
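The error from the question can be reproduced with a small sketch that imitates the wrapper pattern. This is not Gym's actual code: `ToyFrozenLake` and `ToyTimeLimit` are hypothetical stand-ins, and the sketch assumes, as appeared to be the case in Gym around the time of the question, that a wrapper only passes through the standard methods and does not forward unknown attributes.

```python
class ToyFrozenLake:
    """Hypothetical base environment with an environment-specific attribute P."""

    def __init__(self):
        # P[state][action] -> list of (probability, next_state, reward, done)
        self.P = {0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, True)]}}
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        prob, next_state, reward, done = self.P[self.state][action][0]
        self.state = next_state
        return next_state, reward, done, {}

    @property
    def unwrapped(self):
        return self  # the base environment is already unwrapped


class ToyTimeLimit:
    """Hypothetical wrapper: ends the episode after max_steps steps."""

    def __init__(self, env, max_steps):
        self.env = env
        self.max_steps = max_steps
        self.elapsed = 0

    def reset(self):
        self.elapsed = 0
        return self.env.reset()

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self.elapsed += 1
        return obs, reward, done or self.elapsed >= self.max_steps, info

    @property
    def unwrapped(self):
        return self.env.unwrapped  # strip every wrapper layer


env = ToyTimeLimit(ToyFrozenLake(), max_steps=10)

try:
    env.P  # the wrapper itself has no attribute P
    message = None
except AttributeError as err:
    message = str(err)

transitions = env.unwrapped.P[0][1]  # reaches the base environment's table
```

With the real library the same idea applies: you call `.unwrapped` on whatever `gym.make` returns and then read the environment-specific attribute from the result.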

Miotto answered Oct 09 '22 12:10
