Gymnasium environments for the Search Race CodinGame optimization puzzle and Mad Pod Racing CodinGame bot programming game.
| Action Space | Box([-1, 0], [1, 1], float64) |
| Observation Space | Box([0, 0, 0, 0, 0, 0, 0, -1, -1, 0], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], float64) |
| Import | gymnasium.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v1") |
To install gymnasium-search-race with pip, execute:
pip install gymnasium_search_race
From source:
git clone https://github.com/Quentin18/gymnasium-search-race
cd gymnasium-search-race/
pip install -e .
The action is an ndarray with 2 continuous variables:
- The rotation angle between -18 and 18 degrees, normalized between -1 and 1.
- The thrust between 0 and 200, normalized between 0 and 1.
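As a rough illustration of the normalization described above, the following sketch converts a normalized action back to degrees and thrust units. It assumes a plain linear scaling and clips out-of-range inputs; the function name `denormalize_action` and the constants are hypothetical and not part of the package, and the environment may round or clip differently.

```python
MAX_ROTATION_DEGREES = 18.0  # rotation range stated above: -18 to 18 degrees
MAX_THRUST = 200.0           # thrust range stated above: 0 to 200


def denormalize_action(action):
    """Map a normalized (rotation, thrust) pair to degrees and thrust units.

    Hypothetical helper: assumes linear scaling, which the text implies
    but does not spell out.
    """
    rotation, thrust = action
    # Clip to the bounds of the Box action space before scaling.
    rotation = max(-1.0, min(1.0, float(rotation)))
    thrust = max(0.0, min(1.0, float(thrust)))
    return rotation * MAX_ROTATION_DEGREES, thrust * MAX_THRUST


print(denormalize_action([0.5, 0.25]))  # (9.0, 50.0)
```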
The observation is an ndarray of 10 continuous variables:
- 1 if the next checkpoint is the last one, 0 otherwise.
- The x and y coordinates of the next checkpoint.
- The x and y coordinates of the checkpoint after the next one.
- The x and y coordinates of the car.
- The horizontal speed vx and vertical speed vy of the car.
- The facing angle of the car.
The values are normalized between 0 and 1, or -1 and 1 if negative values are allowed.
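To make the layout easier to work with, the sketch below unpacks a 10-element observation into named fields. It assumes the array follows the order of the list above, which matches the bounds of the observation space (only the two speed entries allow negative values). The `Observation` type and `unpack_observation` function are hypothetical helpers, not part of the package.

```python
from typing import NamedTuple, Sequence, Tuple


class Observation(NamedTuple):
    """Named view of the 10 normalized observation values (hypothetical)."""
    next_is_last: bool
    next_checkpoint: Tuple[float, float]
    following_checkpoint: Tuple[float, float]
    position: Tuple[float, float]
    velocity: Tuple[float, float]
    angle: float


def unpack_observation(obs: Sequence[float]) -> Observation:
    """Split a flat observation, assuming the order listed in the text."""
    if len(obs) != 10:
        raise ValueError(f"expected 10 values, got {len(obs)}")
    values = [float(v) for v in obs]
    return Observation(
        next_is_last=values[0] == 1.0,
        next_checkpoint=(values[1], values[2]),
        following_checkpoint=(values[3], values[4]),
        position=(values[5], values[6]),
        velocity=(values[7], values[8]),
        angle=values[9],
    )
```

A call such as `unpack_observation(obs).velocity` then returns the normalized `(vx, vy)` pair.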
The goal is to visit all checkpoints as quickly as possible, so the agent is penalised with a reward of -0.1 for each timestep.
When a checkpoint is visited, the agent receives a reward of 1000 / total_checkpoints.
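The reward scheme can be checked with a little arithmetic. The sketch below computes the undiscounted return of an episode, assuming the per-step penalty and the checkpoint bonus are simply added together (including on steps where a checkpoint is reached). The function `episode_return` is a hypothetical helper for illustration only.

```python
STEP_PENALTY = -0.1              # reward applied at every timestep
TOTAL_CHECKPOINT_REWARD = 1000.0  # split evenly across all checkpoints


def episode_return(num_steps, checkpoints_visited, total_checkpoints):
    """Undiscounted return under the assumed additive reward scheme."""
    per_checkpoint = TOTAL_CHECKPOINT_REWARD / total_checkpoints
    return STEP_PENALTY * num_steps + per_checkpoint * checkpoints_visited


# Visiting all 8 checkpoints in 200 steps: 1000 for the checkpoints,
# minus 0.1 * 200 = 20 for the elapsed time.
print(episode_return(num_steps=200, checkpoints_visited=8, total_checkpoints=8))
```

Under these assumptions, a complete run always earns 1000 from checkpoints, so the return differs between complete runs only through the number of steps taken.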
The starting state is generated by choosing a random CodinGame test case.
The episode ends if either of the following happens:
- Termination: The car visits all checkpoints before the time runs out.
- Truncation: Episode length is greater than 600.
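The two end conditions map onto the `terminated` and `truncated` flags returned by the standard Gymnasium `step` method. The following sketch runs one episode and reports which one ended it. The `run_episode` function and its `policy` argument are hypothetical; it works with any object that follows the Gymnasium `reset`/`step` interface.

```python
def run_episode(env, policy, seed=None):
    """Run one episode and return (outcome, steps, total_reward).

    `policy` is any callable mapping an observation to an action.
    """
    observation, info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    while True:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    # terminated: all checkpoints visited; truncated: step limit reached.
    outcome = "terminated" if terminated else "truncated"
    return outcome, steps, total_reward


# Example usage, assuming the package is installed:
#   import gymnasium as gym
#   env = gym.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v1")
#   print(run_episode(env, lambda obs: env.action_space.sample()))
```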
test_id: test case id used to generate the checkpoints. The default value is None, which selects a test case randomly when the reset method is called.
import gymnasium as gym
gym.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v1", test_id=1)
- v1: Add boolean to indicate if the next checkpoint is the last checkpoint in observation
- v0: Initial version
The SearchRaceDiscrete environment is similar to the SearchRace environment, except that the action space is discrete.
import gymnasium as gym
gym.make("gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v1", test_id=1)
There are 74 discrete actions, corresponding to every combination of an angle from -18 to 18 degrees (37 values) and a thrust of 0 or 200.
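The count of 74 follows from 37 integer angles times 2 thrust values. The sketch below enumerates those combinations. The ordering of the resulting list is an assumption made for illustration; the index-to-action mapping used by the environment may differ.

```python
from itertools import product

ANGLES = range(-18, 19)  # 37 integer angles, in degrees
THRUSTS = (0, 200)       # no thrust or full thrust

# Hypothetical ordering: angle varies slowest, thrust fastest.
ACTIONS = list(product(ANGLES, THRUSTS))

print(len(ACTIONS))  # 74
print(ACTIONS[0])    # (-18, 0)
print(ACTIONS[-1])   # (18, 200)
```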
- v1: Add all angles in action space
- v0: Initial version
The MadPodRacing and MadPodRacingDiscrete environments can be used to train a runner for the Mad Pod Racing CodinGame bot programming game.
They are similar to the SearchRace and SearchRaceDiscrete environments, with the following differences:
- The maximum thrust value is 100 instead of 200.
- The maps are generated the same way CodinGame generates them.
- The car position is rounded and not truncated.
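The difference between rounding and truncating a position can be seen on a single coordinate pair. This sketch is illustrative only: it assumes truncation means dropping the fractional part toward zero, and it uses Python's built-in `round`, which resolves exact ties to the nearest even integer. The tie-breaking rule used by the environment is not stated here, and both helper functions are hypothetical.

```python
import math


def truncate_position(x, y):
    """Drop the fractional part of each coordinate (toward zero)."""
    return math.trunc(x), math.trunc(y)


def round_position(x, y):
    """Round each coordinate to the nearest integer."""
    return round(x), round(y)


print(truncate_position(1234.7, -56.7))  # (1234, -56)
print(round_position(1234.7, -56.7))     # (1235, -57)
```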
import gymnasium as gym
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacing-v0")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v0")
The MadPodRacingBlocker environment can be used to train a blocker for the Mad Pod Racing CodinGame bot programming game.
import gymnasium as gym
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlocker-v0")
You can use RL Baselines3 Zoo to train and evaluate agents:
pip install rl_zoo3
The hyperparameters are defined in hyperparams/ppo.yml.
To train a PPO agent for the Search Race game, execute:
python -m rl_zoo3.train \
--algo ppo \
--env gymnasium_search_race/SearchRace-v1 \
--tensorboard-log logs \
--eval-freq 20000 \
--eval-episodes 10 \
--gym-packages gymnasium_search_race \
--conf-file hyperparams/ppo.yml \
--progress
For the Mad Pod Racing game, you can add an opponent with the opponent_path argument:
python -m rl_zoo3.train \
--algo ppo \
--env gymnasium_search_race/MadPodRacingBlocker-v0 \
--tensorboard-log logs \
--eval-freq 20000 \
--eval-episodes 10 \
--gym-packages gymnasium_search_race \
--env-kwargs "opponent_path:'rl-trained-agents/ppo/gymnasium_search_race-MadPodRacing-v0_1/best_model.zip'" \
--conf-file hyperparams/ppo.yml \
--progress
To see a trained agent in action on random test cases, execute:
python -m rl_zoo3.enjoy \
--algo ppo \
--env gymnasium_search_race/SearchRace-v1 \
--n-timesteps 1000 \
--deterministic \
--gym-packages gymnasium_search_race \
--load-best \
--progress
To run test cases with a trained agent, execute:
python -m scripts.run_test_cases \
--path rl-trained-agents/ppo/gymnasium_search_race-SearchRace-v1_1/best_model.zip \
--env gymnasium_search_race:gymnasium_search_race/SearchRace-v1 \
--record-video \
--record-metrics
To record a video of a trained agent on Mad Pod Racing, execute:
python -m scripts.record_video \
--path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacing-v0_1/best_model.zip \
--env gymnasium_search_race:gymnasium_search_race/MadPodRacing-v0
For Mad Pod Racing Blocker, execute:
python -m scripts.record_video \
--path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingBlocker-v0_1/best_model.zip \
--opponent-path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacing-v0_1/best_model.zip \
--env gymnasium_search_race:gymnasium_search_race/MadPodRacingBlocker-v0
To run tests, execute:
pytest
To cite the repository in publications:
@misc{gymnasium-search-race,
author = {Quentin Deschamps},
title = {Gymnasium Search Race},
year = {2024},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/Quentin18/gymnasium-search-race}},
}
- Gymnasium
- RL Baselines3 Zoo
- Stable Baselines3
- CGSearchRace
- CSB-Runner-Arena
- Coders Strikes Back by Magus