Creating a custom environment in OpenAI Gym - Blocking Maze
This short post shows how to create your own OpenAI Gym environment.
The problem
The environment I’ll create is “Blocking Maze” from Sutton’s RL book [1] (Chapter 8):
The environment is a simple 6x9 grid world with a wall (the shaded area). The agent needs to find the shortest path to the goal, which initially runs around the right end of the wall (left image). After some number of timesteps the wall position changes, and the agent then needs to find a new optimal path (right image).
Image credit: [1]
Code
The official document, How to create new environments for Gym, explains what you need.
According to the document, you need files/directories like this:
gym-foo/
  README.md
  setup.py
  gym_foo/
    __init__.py
    envs/
      __init__.py
      foo_env.py
      foo_extrahard_env.py
So, let’s make it!
1. Top level directory
Create a project directory and README.md first (README is just a blank file at this point).
mkdir blocking-maze
cd blocking-maze
touch README.md
2. setup.py
from setuptools import setup

setup(name='blocking_maze',
      version='0.0.1',
      install_requires=['gym']
)
3. Create other directories inside
mkdir blocking_maze
cd blocking_maze
mkdir envs
4. blocking-maze/blocking_maze/__init__.py
from gym.envs.registration import register

register(
    id='blocking-maze-v01',
    entry_point='blocking_maze.envs:MazeEnv1',
)
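The entry_point string has the form module.path:ClassName. As a rough illustration of how such a string can be resolved (a minimal sketch, not Gym's actual loader; resolve_entry_point is a hypothetical helper name), the module part is imported and the class is then looked up by name:

```python
import importlib


def resolve_entry_point(entry_point):
    """Resolve a 'package.module:ClassName' string to the object it names.

    Hypothetical helper for illustration only; Gym has its own loader.
    """
    module_name, _, attr_name = entry_point.partition(":")
    if not module_name or not attr_name:
        raise ValueError("expected 'module:attribute', got %r" % entry_point)
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Demonstrated with a standard-library class so the snippet runs anywhere.
cls = resolve_entry_point("collections:OrderedDict")
print(cls.__name__)
```

Once the package is installed, 'blocking_maze.envs:MazeEnv1' would resolve the same way, which is why the next step makes MazeEnv1 importable from blocking_maze.envs.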
5. blocking-maze/blocking_maze/envs/__init__.py
from blocking_maze.envs.blocking_maze_env01 import MazeEnv1
6. blocking-maze/blocking_maze/envs/blocking_maze_env01.py
For now, let’s just create a template. You can copy and paste the snippet from the document, but rename the class to MazeEnv1 so that it matches the entry point and the import above.
import gym
from gym import error, spaces, utils
from gym.utils import seeding


class MazeEnv1(gym.Env):
    metadata = {'render.modes': ['human']}

    def __init__(self):
        print("init")

    def step(self, action):
        pass

    def reset(self):
        pass

    def render(self, mode='human'):
        pass

    def close(self):
        pass
You need to implement the step, reset and render methods; otherwise calling them raises NotImplementedError from the base class (see gym/gym/core.py).
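As a minimal sketch of what these methods are expected to return in the classic Gym API (the one used by the test loop later in this post): reset returns an initial observation, and step returns a 4-tuple of observation, reward, done and info. The class below is a hypothetical stand-in that does not inherit from gym.Env, so it runs without Gym installed; the corridor and the reward values are made up for illustration.

```python
class CorridorEnv:
    """Hypothetical 1-D corridor: start at cell 0, goal at the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        # Put the agent back at the start and return the first observation.
        self.position = 0
        return self.position

    def step(self, action):
        # Action 1 moves right, anything else moves left; moves stop at the ends.
        delta = 1 if action == 1 else -1
        self.position = min(max(self.position + delta, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done, {}

    def render(self, mode="human"):
        # Draw the corridor with 'A' marking the agent.
        print("".join("A" if i == self.position else "." for i in range(self.length)))


env = CorridorEnv()
observation = env.reset()
done = False
while not done:
    observation, reward, done, info = env.step(1)  # always move right
env.render()
```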
Although the code doesn’t do anything (I’ll provide the full code later), let’s test the installation now.
Install
Move to the parent directory (the directory which contains blocking-maze/), then
pip install -e blocking-maze
Then you can check it like this:
$ python
>>> import gym
>>> gym.make('blocking_maze:blocking-maze-v01')
init
<blocking_maze.envs.blocking_maze_env01.MazeEnv1 object at 0x1052b9cf8>
>>>
Alright, it seems to be working :)
Defining the details
We just confirmed that we could use the custom environment. Now let’s define details.
Well, the implementation itself is not complicated, so please check this gist: blocking_maze_env01.py
What you need to do is:
- Define your map (I used ‘w’ for walls and whitespace for walkable tiles)
- Implement all necessary functions (step, reset, and render) along with some other helper functions
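As a minimal sketch of both items, a map can be stored as a list of strings and parsed into a set of wall coordinates, with a small helper that applies one move. The two layouts below are an assumption based on the blocking maze figure in [1] (a one-row wall with the gap first on the right, then on the left); they are not taken from the gist. MAP_1, MAP_2, ACTIONS, parse_walls and move are hypothetical names, and the action order is arbitrary.

```python
N_ROWS, N_COLS = 6, 9
OPEN_ROW = " " * N_COLS  # a row with no walls

# 'w' is a wall, a space is a walkable tile.
MAP_1 = [OPEN_ROW] * 3 + ["w" * 8 + " "] + [OPEN_ROW] * 2  # gap on the right
MAP_2 = [OPEN_ROW] * 3 + [" " + "w" * 8] + [OPEN_ROW] * 2  # gap on the left

# Actions as (row, col) offsets: up, down, left, right (assumed order).
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def parse_walls(layout):
    """Return the set of (row, col) cells marked as walls."""
    return {(r, c) for r, row in enumerate(layout)
            for c, tile in enumerate(row) if tile == "w"}


def move(location, action, walls):
    """Apply an action; stay in place when hitting a wall or the border."""
    row, col = location
    d_row, d_col = ACTIONS[action]
    new = (row + d_row, col + d_col)
    inside = 0 <= new[0] < N_ROWS and 0 <= new[1] < N_COLS
    return new if inside and new not in walls else location


walls = parse_walls(MAP_1)
print(N_ROWS * N_COLS - len(walls))  # number of walkable tiles
print(move((4, 3), 0, walls))        # moving up into the wall
print(move((4, 8), 0, walls))        # moving up through the gap
```

With either layout, 8 of the 54 cells are walls, which leaves 46 walkable tiles.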
I also set action_space and observation_space as follows. This is not strictly necessary, as the code runs without them:
self.action_space = spaces.Discrete(len(self.actions))
self.observation_space = spaces.Discrete(46)
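One way to arrive at 46 (an assumption about the intent; the gist may encode observations differently) is to count walkable tiles: a 6x9 grid has 54 cells, and a one-row wall of 8 cells leaves 46. The sketch below enumerates walkable cells in row-major order so that each one gets an integer in range(46). build_state_index is a hypothetical name, and the wall position is assumed.

```python
def build_state_index(n_rows, n_cols, walls):
    """Map each walkable (row, col) cell to a consecutive integer."""
    index = {}
    for row in range(n_rows):
        for col in range(n_cols):
            if (row, col) not in walls:
                index[(row, col)] = len(index)
    return index


# Assumed layout: a wall covering the first 8 columns of row 3.
walls = {(3, col) for col in range(8)}
state_index = build_state_index(6, 9, walls)

print(len(state_index))     # size to pass to spaces.Discrete
print(state_index[(0, 0)])  # first walkable cell
print(state_index[(3, 8)])  # the only walkable cell in the wall row
```

Note that when the wall moves, the set of walkable cells changes too, so an index built this way would have to be rebuilt; indexing all 54 cells is a simpler alternative.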
Once you implement the details, test it like this:
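import random
import gym

env = gym.make('blocking_maze:blocking-maze-v01')
env.reset()  # put the environment in its initial state before stepping
env.render()
done = False
while not done:
    # Pick one of the four actions uniformly at random.
    action = random.randrange(4)
    observation, reward, done, info = env.step(action)
    env.render()
    print(env.a_loc)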
Yay, it works!
I implemented the switch_maze function, which changes the map layout from 1 to 2. You can use it to design an experiment like the one in the book (e.g., the environment changes after 1,000 timesteps). Enjoy!