This is a side project I implemented with some insightful advice from wasd9813. In this project, I implemented a dueling DQN to solve the car racing problem. Unlike the project I did before here, the agent can only observe the raw pixels of the game screen, which makes the task more challenging. A well-trained agent should be able to extract information from the raw pixels and make good decisions so that the race car stays on the track instead of running off it. After training the agent for thousands of episodes, it learned to stay on the track, make turns, and recover from mistakes (such as accidentally running off the track). A video recording of the result is posted below.
rl-video-episode-0.mp4
Note: Details of the environment, such as the reward, observation space, and action space, can be found in the link here.
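For reference, a minimal sketch of how the environment and its spaces can be inspected, assuming the Gymnasium `CarRacing-v2` environment with the discrete action set (the exact version and settings used in this repo may differ):

```python
# Sketch only: assumes Gymnasium's CarRacing-v2 with discrete actions.
import gymnasium as gym

env = gym.make("CarRacing-v2", continuous=False)
print(env.observation_space)  # Box(0, 255, (96, 96, 3), uint8): raw RGB pixels
print(env.action_space)       # Discrete(5) when continuous=False

obs, info = env.reset(seed=0)
print(obs.shape)              # (96, 96, 3)
env.close()
```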
The game screen is first converted to a grayscale image and center-cropped; it is then stacked with the next 3 frames of gameplay, each processed the same way, to form a single observation. This preprocessing is similar to the method proposed by DeepMind here. The observation is then passed to the dueling DQN, which is composed of convolutional layers followed by fully connected layers. A visualization of the dueling DQN model architecture is shown below.
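To complement that visualization, here is a rough PyTorch sketch of the preprocessing and the dueling network. The crop region, layer sizes, and stream widths are illustrative assumptions, not the exact configuration used in `DQN/`:

```python
import numpy as np
import torch
import torch.nn as nn

def preprocess(frame: np.ndarray) -> np.ndarray:
    """Grayscale + 84x84 center crop of one 96x96x3 RGB frame (crop size is an assumption).
    Four consecutive processed frames are stacked into one observation."""
    gray = frame.astype(np.float32) @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    return gray[6:90, 6:90]

class DuelingDQN(nn.Module):
    """Dueling network: shared conv trunk, separate value and advantage streams."""
    def __init__(self, n_actions: int, in_frames: int = 4):
        super().__init__()
        # Convolutional feature extractor over the stack of 4 grayscale frames
        # (filter sizes follow the DeepMind Atari setup; the real model may differ).
        self.conv = nn.Sequential(
            nn.Conv2d(in_frames, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        with torch.no_grad():
            feat_dim = self.conv(torch.zeros(1, in_frames, 84, 84)).shape[1]
        # State-value stream V(s) and advantage stream A(s, a)
        self.value = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(), nn.Linear(512, 1))
        self.advantage = nn.Sequential(nn.Linear(feat_dim, 512), nn.ReLU(), nn.Linear(512, n_actions))

    def forward(self, x):
        feats = self.conv(x / 255.0)
        v = self.value(feats)       # (batch, 1)
        a = self.advantage(feats)   # (batch, n_actions)
        # Q(s, a) = V(s) + A(s, a) - mean_a A(s, a); subtracting the mean keeps the split identifiable
        return v + a - a.mean(dim=1, keepdim=True)
```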
The hyperparameters of the model can be found in the `DQN/hyperparams.py` file. The exploration rate starts at 1 and decays by a step size of 0.005 every 3000 frames (`eps_decay_interval`) of gameplay; its minimum value is set to 0.1 during the experiment. The process saves the model every 50 episodes of training; this can be modified in `main.py`.
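A minimal sketch of that decay schedule (the constant names mirror the hyperparameters described above; the actual implementation in this repo may differ):

```python
EPS_START = 1.0            # initial exploration rate
EPS_MIN = 0.1              # floor of the exploration rate
EPS_STEP = 0.005           # amount subtracted at every decay
EPS_DECAY_INTERVAL = 3000  # decay once every 3000 frames

def epsilon_at(frame_idx: int) -> float:
    """Piecewise-constant epsilon: drops by EPS_STEP every EPS_DECAY_INTERVAL frames,
    never going below EPS_MIN."""
    return max(EPS_MIN, EPS_START - EPS_STEP * (frame_idx // EPS_DECAY_INTERVAL))
```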
$python main.py               # to run the DQN (the default algorithm)
$python main.py --algo duel   # to run the dueling DQN
You can also record a video of the gameplay with
$python display.py [--algo] [--model]
For example,
$python display.py --algo duel --model 300
will create an agent with the dueling DQN whose network parameters are loaded from `DQN/model/duel/agent_params_300.pth`.
Since an episode is automatically truncated when it reaches the 1000th frame of gameplay (this mechanism prevents a stationary agent from making the process run indefinitely), the `RecordVideo` wrapper can record at most 20 s of gameplay.
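A hedged sketch of how such a recording can be produced with the Gymnasium wrappers (the exact environment id and wrapper arguments used in `display.py` are assumptions here):

```python
import gymnasium as gym
from gymnasium.wrappers import RecordVideo, TimeLimit

# Truncate each episode at 1000 frames and record rollouts; by default RecordVideo
# names the files rl-video-episode-*.mp4.
env = gym.make("CarRacing-v2", continuous=False, render_mode="rgb_array")
env = TimeLimit(env, max_episode_steps=1000)
env = RecordVideo(env, video_folder="videos", episode_trigger=lambda ep: True)

obs, info = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # the trained agent's policy would go here
    obs, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
env.close()
```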
The average reward of a random policy is roughly -55.7034 (calculated from 100 episodes of random gameplay). The average reward of a human player is 802.745 (from 20 episodes of gameplay; the data can be found in `result\human\reward.txt`). The best trained agent obtained an average reward of 742.591 (from 20 episodes of gameplay), which reaches human-level gameplay (the best agent is selected by `eval.py`).
| player | average reward |
|---|---|
| random policy | -55.7034 |
| human | 802.745 |
| DRL agent | 742.591 |
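For completeness, a sketch of how the per-player averages above can be computed over 20 evaluation episodes (the actual logic in `eval.py` may differ):

```python
import numpy as np

def average_reward(env, policy, episodes: int = 20) -> float:
    """Run `episodes` rollouts with the given policy and return the mean episode reward."""
    totals = []
    for _ in range(episodes):
        obs, info = env.reset()
        done, total = False, 0.0
        while not done:
            obs, reward, terminated, truncated, info = env.step(policy(obs))
            total += reward
            done = terminated or truncated
        totals.append(total)
    return float(np.mean(totals))
```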