Previously, this repository contained a Gazebo-based simulation architecture for implementing a reinforcement learning algorithm, DDPG, to generate bipedal walking patterns for the robot.
Here, I am trying to implement the PPO algorithm with the help of TensorFlow Agents (TF-Agents).
This is still a work in progress.
- Ubuntu 18.04
- ROS Melodic
- Gazebo 7
- TensorFlow: 2
- TensorFlow Probability: 0.8.0
- TF-Agents: 0.3.0
- gym: 0.9.3
- Python 3.6.9
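
The Python packages in the list above can be checked against an existing environment with a small script. This is only a sketch: the pip distribution names (`tensorflow`, `tensorflow-probability`, `tf-agents`, `gym`) and the helper names `installed_version` and `check_versions` are assumptions made for illustration and are not part of this repository.

```python
# Pinned versions copied from the dependency list above.
# The pip distribution names used as keys are assumptions.
EXPECTED = {
    "tensorflow": "2",
    "tensorflow-probability": "0.8.0",
    "tf-agents": "0.3.0",
    "gym": "0.9.3",
}


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        # Available in the standard library from Python 3.8 onwards.
        from importlib.metadata import version, PackageNotFoundError
    except ImportError:
        # Fallback for older interpreters such as Python 3.6.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def check_versions(expected, lookup=installed_version):
    """Map each package name to a tuple (expected, found, ok).

    ok is True when the found version equals the expected one or
    extends it with further components (e.g. "2" matches "2.0.0").
    """
    report = {}
    for name, want in expected.items():
        found = lookup(name)
        ok = found is not None and (found == want or found.startswith(want + "."))
        report[name] = (want, found, ok)
    return report


if __name__ == "__main__":
    for name, (want, found, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name:24s} expected {want:8s} found {found} {status}")
```

ROS Melodic and Gazebo are installed outside of pip, so they are not covered by this check.
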
- walker_gazebo contains the robot model (both the .stl files and the .urdf file) and the Gazebo launch file.
- walker_controller contains the reinforcement learning implementation of the DDPG algorithm for control of the bipedal walking robot.
Note: Stable bipedal walking was achieved after training the model for over 41 hours on a system with an Nvidia GeForce GTX 1050 Ti GPU. The visualization of the horizontal boom (attached to the waist) is turned off.
- Lillicrap, Timothy P., et al. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015).
- Silver, David, et al. Deterministic Policy Gradient Algorithms. ICML (2014).
Arun Kumar ([email protected]) & Dr. S N Omkar ([email protected])
Implement state-of-the-art RL algorithms (TRPO and PPO) for the same task. This will hopefully lead to faster training and a shorter convergence time.
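
For orientation, the core of PPO is its clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The snippet below is a minimal pure-Python sketch of that objective only. It is not the code in this repository and does not use TF-Agents; the function name `ppo_clipped_objective` and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximised).

    new_log_probs: log-probabilities of the taken actions under the current policy
    old_log_probs: log-probabilities of the same actions under the data-collecting policy
    advantages:    advantage estimates for those actions
    clip_eps:      clipping range (hypothetical default, tune per task)
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Identical policies give a ratio of 1, so the objective is the mean advantage.
    print(ppo_clipped_objective([0.0, 0.0], [0.0, 0.0], [1.0, 3.0]))
```

With a positive advantage, the gain from raising an action's probability is capped once the ratio exceeds `1 + clip_eps`; with a negative advantage, the penalty is not capped in that direction, which discourages large policy steps.
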