add mujoco pg codes and unity ml-agent codes

reinforcement-learning-kr · Jul 15, 2018 · 874f1ad
1 parent 1b682e2
commit 874f1ad

Showing 263 changed files with 125,476 additions and 577 deletions.
diff --git a/LICENSE b/LICENSE
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2018 Woongwon Lee
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/README.md b/README.md
@@ -1,2 +1,14 @@
-# pg_travel
-Policy Gradient algorithms (REINFORCE, vanilla actor-critic, DPG, NPG, TRPO, PPO)
+# PG Travel
+PyTorch implementation of Vanilla Policy Gradient (PG), Truncated Natural Policy Gradient (NPG), Trust Region Policy Optimization (TRPO), and Proximal Policy Optimization (PPO).
+
+# Train
+* **algorithm**: PG, NPG, TRPO, PPO
+* **env**: Ant-v2, HalfCheetah-v2, Hopper-v2, Humanoid-v2, HumanoidStandup-v2, InvertedPendulum-v2, Reacher-v2, Swimmer-v2, Walker2d-v2
+~~~
+python train.py --algorithm "algorithm name" --env "environment name"
+~~~
+
+# Reference
+This code is a modified version of the following code:
+* [OpenAI Baselines](https://github.com/openai/baselines/tree/master/baselines/trpo_mpi)
+* [PyTorch implementation of TRPO](https://github.com/ikostrikov/pytorch-trpo)
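
Reviewer note: the Train section of the README hunk above lists the accepted `--algorithm` and `--env` values but does not show how they are validated. The sketch below is one way such a command line could be parsed with `argparse`. It is an assumption for illustration only: `build_parser` and `parse_train_args` are hypothetical names, and the actual `train.py` in this commit may define its flags differently.

```python
import argparse

# Values copied from the README's Train section.
ALGORITHMS = ["PG", "NPG", "TRPO", "PPO"]
ENVS = [
    "Ant-v2", "HalfCheetah-v2", "Hopper-v2", "Humanoid-v2",
    "HumanoidStandup-v2", "InvertedPendulum-v2", "Reacher-v2",
    "Swimmer-v2", "Walker2d-v2",
]


def build_parser():
    """Hypothetical parser; the real train.py may differ."""
    parser = argparse.ArgumentParser(
        description="Sketch of the README's train command"
    )
    # `choices` makes argparse reject any value outside the README's lists.
    parser.add_argument("--algorithm", choices=ALGORITHMS, required=True)
    parser.add_argument("--env", choices=ENVS, required=True)
    return parser


def parse_train_args(argv):
    """Parse a list of command-line tokens and return (algorithm, env)."""
    args = build_parser().parse_args(argv)
    return args.algorithm, args.env


if __name__ == "__main__":
    algorithm, env = parse_train_args(
        ["--algorithm", "PPO", "--env", "Hopper-v2"]
    )
    print(f"would train {algorithm} on {env}")
```

With this shape, a misspelled algorithm or environment name fails at parse time with a usage message instead of surfacing later as an environment-creation error.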
diff --git a/cartpole/linear_pg.py b/cartpole/linear_pg.py