tensorflow-policy-gradient

Still under construction...

Dependencies

Python 2.7
TensorFlow >= 0.8.0
NumPy >= 1.10.0
OpenAI Gym
matplotlib

Quick try

Run

python gym_experiment.py

to train a softmax policy (without bias) using vanilla policy gradient on the CartPole task. The return varies from episode to episode but trends upward until it reaches the maximum (200).
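For readers who want to see the idea before running the script, here is a minimal sketch of a bias-free softmax policy and one vanilla policy gradient (REINFORCE) update. It is not the code in `policy_gradient.py`: it uses only the Python standard library instead of TensorFlow, and the function names (`action_probs`, `discounted_returns`, `reinforce_update`), the discount factor, and the learning rate are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_probs(weights, state):
    """Softmax policy without bias: logits[a] = dot(weights[a], state)."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_update(weights, episode, gamma=0.99, lr=0.01):
    """One vanilla policy gradient step on a single episode.

    episode is a list of (state, action, reward) tuples.
    gamma and lr are illustrative defaults, not values from the repository.
    """
    rewards = [step[2] for step in episode]
    returns = discounted_returns(rewards, gamma)
    grads = [[0.0] * len(row) for row in weights]
    for (state, action, _), g in zip(episode, returns):
        probs = action_probs(weights, state)
        for a in range(len(weights)):
            # d log pi(action|state) / d logits[a] = 1[a == action] - pi(a|state)
            coeff = (1.0 if a == action else 0.0) - probs[a]
            for i, s in enumerate(state):
                grads[a][i] += g * coeff * s
    # Gradient ascent on the expected return.
    return [[w + lr * d for w, d in zip(row, grow)]
            for row, grow in zip(weights, grads)]
```

For CartPole, `weights` would have two rows (one per action) of four entries each (one per observation component). Calling `reinforce_update` after each episode with the recorded `(state, action, reward)` tuples raises the probability of actions that were followed by high returns.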
