new version (#43)
christopherhesse authored Jun 3, 2020
1 parent 04d2005 commit 615e751
Showing 55 changed files with 2,848 additions and 1,988 deletions.
9 changes: 9 additions & 0 deletions CHANGES.md
@@ -1,5 +1,14 @@
# Changelog

## 0.10.0

* add `set_state`, `get_state` methods to save/restore environment state
* new flags: `use_backgrounds`, `restrict_themes`, `use_monochrome_assets`
* switch to use `gym3` instead of `libenv` + `Scalarize`; the `gym` and `baselines.VecEnv` interfaces are still available with the same names, and the `gym3` environment is called `ProcgenGym3Env`
* zero initialize more member variables
* changed `info` dict to have clearer keys: `prev_level_complete` tells you whether the level was completed on the previous timestep. The `info` dict corresponds to the current timestep, and the current timestep is never on a completed level because of automatic resetting. Similarly, `prev_level_seed` is the level seed from the previous timestep.
* environment creation should be slightly faster
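
A minimal sketch of how the renamed `info` keys might be consumed. The helper name `completed_level_seeds` and the example dicts are hypothetical and written by hand for illustration; only the key names `prev_level_complete` and `prev_level_seed` come from the changelog entry above.

```python
from typing import Iterable, List, Mapping


def completed_level_seeds(infos: Iterable[Mapping[str, int]]) -> List[int]:
    """Collect the seed of every level that was reported as completed."""
    seeds = []
    for info in infos:
        # Both keys describe the previous timestep, because the current
        # timestep already belongs to a freshly reset level.
        if info.get("prev_level_complete", 0) == 1:
            seeds.append(int(info["prev_level_seed"]))
    return seeds


# Hypothetical per-timestep info dicts (not real environment output).
example_infos = [
    {"prev_level_seed": 7, "prev_level_complete": 0},
    {"prev_level_seed": 7, "prev_level_complete": 1},
    {"prev_level_seed": 12, "prev_level_complete": 0},
]
print(completed_level_seeds(example_infos))  # prints [7]
```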

## 0.9.5

* zero initialize member variables from base classes
102 changes: 80 additions & 22 deletions README.md
@@ -13,7 +13,7 @@ These environments are associated with the paper [Leveraging Procedural Generati
Compared to [Gym Retro](https://github.com/openai/retro), these environments are:

* Faster: Gym Retro environments are already fast, but Procgen environments can run >4x faster.
* Non-deterministic: Gym Retro environments are always the same, so you can memorize a sequence of actions that will get the highest reward. Procgen environments are randomized so this is not possible.
* Randomized: Gym Retro environments are always the same, so you can memorize a sequence of actions that will get the highest reward. Procgen environments are randomized so this is not possible.
* Customizable: If you install from source, you can perform experiments where you change the environments, or build your own environments. The environment-specific code for each environment is often less than 300 lines. This is almost impossible with Gym Retro.

Supported platforms:
@@ -56,7 +56,7 @@ To try an environment out interactively:
python -m procgen.interactive --env-name coinrun
```

The keys are: left/right/up/down + q, w, e, a, s, d for the different (environment-dependent) actions. Your score is displayed as "episode_return" on the right. At the end of an episode, you can see your final "episode_return" as well as "level_completed" which will be `1` if you successfully completed the level.
The keys are: left/right/up/down + q, w, e, a, s, d for the different (environment-dependent) actions. Your score is displayed as "episode_return" in the lower left. At the end of an episode, you can see your final "episode_return" as well as "prev_level_complete" which will be `1` if you successfully completed the level.

To create an instance of the [gym](https://github.com/openai/gym) environment:

@@ -65,22 +65,20 @@ import gym
env = gym.make("procgen:procgen-coinrun-v0")
```

To create an instance of the vectorized environment:
To create an instance of the [gym3](https://github.com/openai/gym3) (vectorized) environment:

```
from procgen import ProcgenEnv
venv = ProcgenEnv(num_envs=1, env_name="coinrun")
from procgen import ProcgenGym3Env
env = ProcgenGym3Env(num=1, env_name="coinrun")
```
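
A gym3 environment is driven with `act()` followed by `observe()`, rather than a single `step()` call. The loop below is a sketch of that pattern; the helper `run_random_steps` and its `sample_action` argument are hypothetical names, and the commented usage assumes `procgen` and `gym3` are installed.

```python
from typing import Any, Callable, List


def run_random_steps(
    env: Any, sample_action: Callable[[Any], Any], num_steps: int
) -> List[float]:
    """Step a gym3-style environment and return the summed reward of each step."""
    totals = []
    for _ in range(num_steps):
        # act() submits one action per sub-environment and returns nothing.
        env.act(sample_action(env))
        # observe() returns (reward, observation, first) for the whole batch.
        rew, obs, first = env.observe()
        totals.append(float(sum(rew)))
    return totals


# With procgen and gym3 installed, usage would look roughly like this:
#   import gym3
#   from procgen import ProcgenGym3Env
#   env = ProcgenGym3Env(num=2, env_name="coinrun")
#   sample = lambda e: gym3.types_np.sample(e.ac_space, bshape=(e.num,))
#   run_random_steps(env, sample, num_steps=100)
```

Keeping the loop independent of the concrete environment class makes it easy to exercise with a stub before running the real games.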

The environment uses the [`VecEnv`](https://github.com/openai/baselines/blob/master/baselines/common/vec_env/vec_env.py#L29) interface from [`baselines`](https://github.com/openai/baselines), `baselines` is not a dependency of this library.

### Docker

A [`Dockerfile`](docker/Dockerfile) is included to demonstrate a minimal Docker-based setup that works for running a random agent.

```
docker build docker --tag procgen
docker run --rm -it procgen python3 -m procgen.examples.random_agent
docker run --rm -it procgen python3 -m procgen.examples.random_agent_gym
```

## Environments
@@ -115,14 +113,18 @@ Here are the 16 environments:
## Environment Options

* `env_name` - Name of environment, or comma-separated list of environment names to instantiate as each env in the VecEnv.
* `num_levels` - The number of unique levels that can be generated. Set to 0 to use unlimited levels.
* `start_level` - The lowest seed that will be used to generate levels. `start_level` and `num_levels` fully specify the set of possible levels.
* `paint_vel_info` - Paint player velocity info in the top left corner. Only supported by certain games.
* `use_generated_assets` - Use randomly generated assets in place of human designed assets.
* `debug_mode` - A useful flag that's passed through to procgen envs. Use however you want during debugging.
* `center_agent` - Determines whether observations are centered on the agent or display the full level. Override at your own risk.
* `use_sequential_levels` - When you reach the end of a level, the episode is ended and a new level is selected. If `use_sequential_levels` is set to `True`, reaching the end of a level does not end the episode, and the seed for the new level is derived from the current level seed. If you combine this with `start_level=<some seed>` and `num_levels=1`, you can have a single linear series of levels similar to a gym-retro or ALE game.
* `distribution_mode` - What variant of the levels to use, the options are `"easy", "hard", "extreme", "memory", "exploration"`. All games support `"easy"` and `"hard"`, while other options are game-specific. The default is `"hard"`. Switching to `"easy"` will reduce the number of timesteps required to solve each game and is useful for testing or when working with limited compute resources.
* `num_levels=0` - The number of unique levels that can be generated. Set to 0 to use unlimited levels.
* `start_level=0` - The lowest seed that will be used to generate levels. `start_level` and `num_levels` fully specify the set of possible levels.
* `paint_vel_info=False` - Paint player velocity info in the top left corner. Only supported by certain games.
* `use_generated_assets=False` - Use randomly generated assets in place of human designed assets.
* `debug=False` - Set to `True` to use the debug build if building from source.
* `debug_mode=0` - A useful flag that's passed through to procgen envs. Use however you want during debugging.
* `center_agent=True` - Determines whether observations are centered on the agent or display the full level. Override at your own risk.
* `use_sequential_levels=False` - When you reach the end of a level, the episode is ended and a new level is selected. If `use_sequential_levels` is set to `True`, reaching the end of a level does not end the episode, and the seed for the new level is derived from the current level seed. If you combine this with `start_level=<some seed>` and `num_levels=1`, you can have a single linear series of levels similar to a gym-retro or ALE game.
* `distribution_mode="hard"` - What variant of the levels to use, the options are `"easy", "hard", "extreme", "memory", "exploration"`. All games support `"easy"` and `"hard"`, while other options are game-specific. The default is `"hard"`. Switching to `"easy"` will reduce the number of timesteps required to solve each game and is useful for testing or when working with limited compute resources.
* `use_backgrounds=True` - Normally games use human-designed backgrounds. If this flag is set to `False`, games will use pure black backgrounds.
* `restrict_themes=False` - Some games select assets from multiple themes. If this flag is set to `True`, those games will only use a single theme.
* `use_monochrome_assets=False` - If set to `True`, games will use monochromatic rectangles instead of human-designed assets. Best used with `restrict_themes=True`.

Here's how to set the options:

@@ -131,19 +133,33 @@ import gym
env = gym.make("procgen:procgen-coinrun-v0", start_level=0, num_levels=1)
```

For the vectorized environment:
Since the gym environment is adapted from a gym3 environment, early calls to `reset()` are disallowed and the `render()` method does nothing. To render the environment, pass `render=True`, which sets `render_human=True` on the environment and wraps it in a `gym3.ViewerWrapper`.

For the gym3 vectorized environment:

```
from procgen import ProcgenGym3Env
env = ProcgenGym3Env(num=1, env_name="coinrun", start_level=0, num_levels=1)
```
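
As a sketch of how `start_level` and `num_levels` interact, the helper below computes the seeds a configuration would draw from. It assumes the level set is the contiguous range starting at `start_level`, which is an inference from the option descriptions rather than something this README states outright; `level_seed_range` and the two option dicts are hypothetical names.

```python
from typing import Optional


def level_seed_range(start_level: int = 0, num_levels: int = 0) -> Optional[range]:
    """Seeds available for level generation, or None when levels are unlimited."""
    if num_levels < 0:
        raise ValueError("num_levels must be non-negative")
    if num_levels == 0:
        # 0 means "unlimited levels", so there is no finite seed set to report.
        return None
    # Assumption: seeds are contiguous, beginning at start_level.
    return range(start_level, start_level + num_levels)


# Hypothetical train/test split: a fixed set of levels for training and
# unlimited levels for evaluation.
train_options = dict(env_name="coinrun", start_level=0, num_levels=200)
test_options = dict(env_name="coinrun", start_level=0, num_levels=0)

print(level_seed_range(train_options["start_level"], train_options["num_levels"]))
print(level_seed_range(test_options["start_level"], test_options["num_levels"]))
```

Either dict can then be unpacked into the constructor, for example `ProcgenGym3Env(num=1, **train_options)`.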

## Saving and loading the environment state

If you are using the gym3 interface, you can save and load the environment state:

```
from procgen import ProcgenEnv
venv = ProcgenEnv(num_envs=1, env_name="coinrun", start_level=0, num_levels=1)
from procgen import ProcgenGym3Env
env = ProcgenGym3Env(num=1, env_name="coinrun", start_level=0, num_levels=1)
states = env.callmethod("get_state")
env.callmethod("set_state", states)
```

`get_state` returns a list of byte strings, one for each game in the vectorized environment, and `set_state` accepts a list in the same format.
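
Because the saved state is a plain list of byte strings, it can be written to disk with the standard library. The sketch below uses `pickle`; the helper names `save_states` and `load_states` and the file name are hypothetical, and the commented usage assumes a `ProcgenGym3Env` instance called `env`.

```python
import pickle
from pathlib import Path
from typing import List


def save_states(path: str, states: List[bytes]) -> None:
    """Write one byte string per game to a single file."""
    Path(path).write_bytes(pickle.dumps([bytes(s) for s in states]))


def load_states(path: str) -> List[bytes]:
    """Read back the list written by save_states."""
    return pickle.loads(Path(path).read_bytes())


# Usage with a real environment would look roughly like this:
#   save_states("checkpoint.pkl", env.callmethod("get_state"))
#   env.callmethod("set_state", load_states("checkpoint.pkl"))
```

Only load pickle files that you created yourself, since unpickling untrusted data can execute arbitrary code.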

## Notes

* You should depend on a specific version of this library (using `==`) for your experiments to ensure they are reproducible. You can get the current installed version with `pip show procgen`.
* This library does not require or make use of GPUs.
* While the library should be thread safe, each individual environment instance should only be used from a single thread. The library is not fork safe unless you set `num_threads=0`. Even if you do that, `Qt` is not guaranteed to be fork safe, so you should probably create the environment after forking or not use fork at all.
* Calling `reset()` early will not do anything, please re-create the environment if you want to reset it early.
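
To enforce the version pin from the first note inside an experiment script, a check along these lines could be run at startup. This is a sketch using the standard-library `importlib.metadata` module (Python 3.8+); the helper names are hypothetical and the expected version string is only an example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pinned(package: str, expected: str) -> bool:
    """True only when the installed version matches the pinned one exactly."""
    return installed_version(package) == expected


# Example: refuse to start an experiment on an unexpected version.
#   if not check_pinned("procgen", "0.10.0"):
#       raise RuntimeError("unexpected procgen version")
```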

# Install from Source

@@ -156,12 +172,12 @@ conda env update --name procgen --file environment.yml
conda activate procgen
pip install -e .
# this should say "building procgen...done"
python -c "from procgen import ProcgenEnv; ProcgenEnv(num_envs=1, env_name='coinrun')"
python -c "from procgen import ProcgenGym3Env; ProcgenGym3Env(num=1, env_name='coinrun')"
# this should create a window where you can play the coinrun environment
python -m procgen.interactive
```

The environment code is in C++ and is compiled into a shared library loaded by python using a C interface based on [`libenv`](https://github.com/cshesse/libenv). The C++ code uses [Qt](https://www.qt.io/) for drawing.
The environment code is in C++ and is compiled into a shared library that Python loads through a C interface provided by [`gym3.libenv`](https://github.com/openai/gym3). The C++ code uses [Qt](https://www.qt.io/) for drawing.

# Create a new environment

@@ -174,6 +190,48 @@ Once you have installed from source, you can customize an existing environment o

This repo includes a travis configuration that will compile your environment and build python wheels for easy installation. In order to have this build more quickly by caching the Qt compilation, you will want to configure a GCS bucket in [common.py](https://github.com/openai/procgen/blob/master/procgen-build/procgen_build/common.py#L5) and [setup service account credentials](https://github.com/openai/procgen/blob/master/procgen-build/procgen_build/build_package.py#L41).

# Add information to the info dictionary

To export game information from the C++ game code to Python, you can define a new `info_type`. `info_type`s appear in the `info` dict returned by the gym environment, or in `get_info()` from the gym3 environment.

To define a new one, add the following code to the `VecGame` constructor here: [vecgame.cpp](https://github.com/openai/procgen/blob/master/procgen/src/vecgame.cpp#L290)

```
{
    struct libenv_tensortype s;
    strcpy(s.name, "heist_key_count");
    s.scalar_type = LIBENV_SCALAR_TYPE_DISCRETE;
    s.dtype = LIBENV_DTYPE_INT32;
    s.ndim = 0;
    s.low.int32 = 0;
    s.high.int32 = INT32_MAX;
    info_types.push_back(s);
}
```

This lets the Python code know to expect a single integer and expose it in the `info` dict.

After adding that, you can add the following code to [heist.cpp](https://github.com/openai/procgen/blob/master/procgen/src/games/heist.cpp#L93):

```
void observe() override {
    Game::observe();
    int32_t key_count = 0;
    for (const auto& has_key : has_keys) {
        if (has_key) {
            key_count++;
        }
    }
    *(int32_t *)(info_bufs[info_name_to_offset.at("heist_key_count")]) = key_count;
}
```

This populates the `heist_key_count` info value each time the environment is observed.

If you run the interactive script (making sure that you installed from source), the new keys should appear in the bottom left hand corner:

`python -m procgen.interactive --env-name heist`
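
On the Python side, the gym3 environment's `get_info()` returns one dict per game, so the new value can be read by key. The helper below is a sketch; `read_info_value` is a hypothetical name and the example list stands in for real `get_info()` output.

```python
from typing import Any, List, Mapping, Sequence


def read_info_value(
    infos: Sequence[Mapping[str, Any]], key: str, default: Any = None
) -> List[Any]:
    """Pull one info value out of each per-game info dict."""
    return [info.get(key, default) for info in infos]


# Hypothetical stand-in for `env.get_info()` on a two-game heist batch.
example_infos = [{"heist_key_count": 1}, {"heist_key_count": 0}]
print(read_info_value(example_infos, "heist_key_count"))  # prints [1, 0]
```

Using `default` avoids a `KeyError` when the library was not rebuilt from source and the key is therefore absent.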

# Changelog

See [CHANGES](CHANGES.md) for changes present in each release.
7 changes: 2 additions & 5 deletions environment.yml
@@ -9,10 +9,7 @@ dependencies:
- qt=5.12.5 # conda-forge does not have 5.13.2 available
- pip
- pip:
- gym3==0.3.0
- numpy==1.17.2
- gym==0.15.3
- filelock==3.0.10
- cffi==1.13.2
- pyglet==1.3.2
- imageio==2.6.1
- imageio-ffmpeg==0.3.0
- filelock==3.0.10
21 changes: 17 additions & 4 deletions procgen-build/procgen_build/build_package.py
@@ -2,6 +2,7 @@
from urllib.request import urlretrieve
import os
import subprocess as sp
import fnmatch

import blobfile as bf

@@ -37,6 +38,13 @@ def init_vsvars():
os.environ[k] = v


def get_var(pattern):
    # return the value of the first environment variable whose name matches the glob pattern
    for key, value in os.environ.items():
        if fnmatch.fnmatch(key, pattern):
            return value
    return None


def setup_google_credentials():
# brew install travis
# travis login --org
@@ -46,10 +54,15 @@ def setup_google_credentials():
# travis encrypt-file --org /tmp/key.json
    input_path = os.path.join(SCRIPT_DIR, "key.json.enc")
    output_path = os.path.join(os.getcwd(), "key.json")
    if "encrypted_d853b3b05b79_key" not in os.environ:
    for h in ["d853b3b05b79", "41b34d34b52c"]:
        key = os.environ.get(f"encrypted_{h}_key")
        iv = os.environ.get(f"encrypted_{h}_iv")
        if key is not None:
            break
    if key is None:
        # being compiled on a fork
        return False
    sp.run(["openssl", "aes-256-cbc", "-K", os.environ["encrypted_d853b3b05b79_key"], "-iv", os.environ["encrypted_d853b3b05b79_iv"], "-in", input_path, "-out", output_path, "-d"], check=True)
    sp.run(["openssl", "aes-256-cbc", "-K", key, "-iv", iv, "-in", input_path, "-out", output_path, "-d"], check=True)
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = output_path
    return True

@@ -59,7 +72,7 @@ def main():

os.environ.update(
{
"CIBW_BUILD": "cp36-macosx_10_6_intel cp37-macosx_10_6_intel cp38-macosx_10_9_x86_64 cp36-manylinux_x86_64 cp37-manylinux_x86_64 cp38-manylinux_x86_64 cp36-win_amd64 cp37-win_amd64 cp38-win_amd64",
"CIBW_BUILD": "cp36-macosx_x86_64 cp37-macosx_x86_64 cp38-macosx_x86_64 cp36-manylinux_x86_64 cp37-manylinux_x86_64 cp38-manylinux_x86_64 cp36-win_amd64 cp37-win_amd64 cp38-win_amd64",
"CIBW_BEFORE_BUILD": "pip install -e procgen-build && python -u -m procgen_build.build_qt --output-dir /tmp/qt5",
"CIBW_TEST_EXTRAS": "test",
# the --pyargs option causes pytest to use the installed procgen wheel
@@ -95,7 +108,7 @@ def main():
elif platform.system() == "Windows":
init_vsvars()

run("pip install cibuildwheel==1.0.0")
run("pip install cibuildwheel==1.4.1")
run("cibuildwheel --output-dir wheelhouse")

if have_credentials:
7 changes: 5 additions & 2 deletions procgen-build/procgen_build/common.py
@@ -1,14 +1,17 @@
import subprocess as sp
import time
import shlex


GCS_BUCKET = "openai-procgen"


def run(cmd, **kwargs):
def run(cmd, shell=True, **kwargs):
    print(f"RUN: {cmd}")
    start = time.time()
    p = sp.run(cmd, shell=True, encoding="utf8", **kwargs)
    if not shell:
        cmd = shlex.split(cmd)
    p = sp.run(cmd, shell=shell, encoding="utf8", **kwargs)
    print(f"ELAPSED: {time.time() - start}")
    if p.returncode != 0:
        print(f"cmd {cmd} failed")
38 changes: 21 additions & 17 deletions procgen-build/procgen_build/dev_test.py
@@ -19,28 +19,32 @@ def main():
    if platform.system() == "Linux":
        apt_install(["mesa-common-dev"])

    installer_urls = {
        "Linux": "https://repo.anaconda.com/miniconda/Miniconda3-4.7.12.1-Linux-x86_64.sh",
        "Darwin": "https://repo.anaconda.com/miniconda/Miniconda3-4.7.12.1-MacOSX-x86_64.sh",
        "Windows": "https://repo.anaconda.com/miniconda/Miniconda3-4.7.12.1-Windows-x86_64.exe",
    }
    installer_url = installer_urls[platform.system()]
    urlretrieve(
        installer_url,
        "miniconda-installer.exe" if platform.system() == "Windows" else "miniconda-installer.sh",
    )
    if platform.system() == "Windows":
        # using the installer seems to hang so use chocolatey instead
        run("choco install miniconda3 --version 4.7.12.1 --no-progress --yes")
        os.environ["PATH"] = "C:\\tools\\miniconda3;C:\\tools\\miniconda3\\Library\\bin;C:\\tools\\miniconda3\\Scripts;" + os.environ["PATH"]
        run("miniconda-installer.exe /S /D=c:\\miniconda3")
        os.environ["PATH"] = "C:\\miniconda3;C:\\miniconda3\\Library\\bin;C:\\miniconda3\\Scripts;" + os.environ["PATH"]
    else:
        installer_urls = {
            "Linux": "https://repo.anaconda.com/miniconda/Miniconda2-4.7.12.1-Linux-x86_64.sh",
            "Darwin": "https://repo.anaconda.com/miniconda/Miniconda2-4.7.12.1-MacOSX-x86_64.sh",
        }
        installer_url = installer_urls[platform.system()]
        urlretrieve(
            installer_url,
            "miniconda-installer.sh",
        )
        conda_path = os.path.join(os.getcwd(), "miniconda")
        run(f"bash miniconda-installer.sh -b -p {conda_path}")
        os.environ["PATH"] = f"/{conda_path}/bin/:" + os.environ["PATH"]
    run("conda env update --name base --file environment.yml")
    run("conda init")
    run("pip install -e .[test]")
    run("""python -c "from procgen import ProcgenEnv; ProcgenEnv(num_envs=1, env_name='coinrun')" """)
    run("pytest --verbose --benchmark-disable --durations=16 .")

    def run_in_conda_env(cmd):
        run(f"conda run --name dev {cmd}", shell=False)

    run("conda env update --name dev --file environment.yml")
    run_in_conda_env("pip show gym3")
    run_in_conda_env("pip install -e .[test]")
    run_in_conda_env("""python -c "from procgen import ProcgenGym3Env; ProcgenGym3Env(num=1, env_name='coinrun')" """)
    run_in_conda_env("pytest --verbose --benchmark-disable --durations=16 .")


if __name__ == "__main__":
2 changes: 2 additions & 0 deletions procgen-build/setup.py
@@ -8,5 +8,7 @@
"blobfile==0.8.0",
# rather than rely on system cmake, install it here
"cmake==3.15.3",
# this is required by procgen/build.py
"gym3==0.3.0",
],
)
6 changes: 3 additions & 3 deletions procgen/CMakeLists.txt
@@ -36,9 +36,6 @@ if (APPLE OR UNIX)
set(CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} -fno-omit-frame-pointer")
endif()

# find libenv.h header
set(CMAKE_INCLUDE_CURRENT_DIR ON)

# include qt5
find_package(Qt5 COMPONENTS Gui REQUIRED)

@@ -75,4 +72,7 @@ add_library(env
src/vecoptions.cpp
)

# find libenv.h header
target_include_directories(env PUBLIC ${LIBENV_DIR})

target_link_libraries(env Qt5::Gui)
4 changes: 2 additions & 2 deletions procgen/__init__.py
@@ -4,9 +4,9 @@
version_path = os.path.join(SCRIPT_DIR, "version.txt")
__version__ = open(version_path).read()

from .env import ProcgenEnv
from .env import ProcgenEnv, ProcgenGym3Env
from .gym_registration import register_environments

register_environments()

__all__ = ["ProcgenEnv"]
__all__ = ["ProcgenEnv", "ProcgenGym3Env"]