What's this

This is a very simple implementation of Path Consistency Learning (PCL). It currently supports only simple environments with a discrete action space.
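
For orientation, the quantity PCL drives toward zero can be sketched as below. This is a minimal, framework-free illustration of the path consistency error from the PCL paper, not the code used in this repository; the function names and the default values of `gamma` and `tau` are assumptions made for the example.

```python
def path_consistency(values, rewards, log_probs, gamma=0.99, tau=0.1):
    """Path consistency error for one sub-trajectory of length d.

    values:    V(s_i), ..., V(s_{i+d})                 (d + 1 entries)
    rewards:   r_i, ..., r_{i+d-1}                     (d entries)
    log_probs: log pi(a_t | s_t) for the same steps    (d entries)
    gamma and tau are illustrative defaults, not this repository's settings.
    """
    d = len(rewards)
    if len(values) != d + 1 or len(log_probs) != d:
        raise ValueError("expected d + 1 values and d log-probabilities")
    # Discounted sum of entropy-regularized rewards along the path.
    path_sum = sum(
        gamma ** j * (rewards[j] - tau * log_probs[j]) for j in range(d)
    )
    # C = -V(s_i) + gamma^d * V(s_{i+d}) + path_sum
    return -values[0] + gamma ** d * values[d] + path_sum


def pcl_loss(values, rewards, log_probs, gamma=0.99, tau=0.1):
    """Squared-error objective: 0.5 * C^2 for a single sub-trajectory."""
    c = path_consistency(values, rewards, log_probs, gamma, tau)
    return 0.5 * c ** 2
```

In training, the values and log-probabilities would come from the value and policy networks, and the loss would be summed over the sub-trajectories of each sampled episode.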

Usage

First, install the requirements:

$ pip install -r requirements.txt

To run training on the CartPole environment:

$ python main.py

This logs the loss, reward, and average sequence length to TensorBoard, which can be viewed with:

$ tensorboard --logdir=runs

The implementation is currently sensitive to initialization, so you may need a few runs to get good results.

Results

A very simple model for the CartPole environment is provided under res/models/cart_pole.

You can see it acting by running:

$ python test_model.py

Todo:

Add unified PCL
Test on more complex environments
Use an epsilon-greedy strategy early in training to force exploration
Implement the prioritized replay buffer described in the paper
Test how expert trajectories improve convergence speed
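
As a rough idea of what the epsilon-greedy item could look like, here is a minimal sketch. It is not implemented in this repository; the helper names `epsilon_at` and `select_action`, the linear decay schedule, and all default values are assumptions chosen for illustration.

```python
import random


def epsilon_at(step, start=1.0, end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from `start` to `end` over `decay_steps` steps."""
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return start + frac * (end - start)


def select_action(policy_probs, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action,
    otherwise sample from the policy's action distribution."""
    n_actions = len(policy_probs)
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return rng.choices(range(n_actions), weights=policy_probs, k=1)[0]
```

Early in training `epsilon_at(step)` is close to 1, so actions are mostly uniform; as it decays, actions are increasingly sampled from the learned policy.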

Hoff97/path-consistency-learning
