DAPG for Dexterous Hand Manipulation

Modified version:

Run with:

cd dapg/examples
python job_script.py --output results/rl_scratch_exp --config cfg/rl_scratch.txt --record_video True --save_id 0 --wandb_activate True --wandb_entity *

or just:

./launch.sh

Note that --record_video and --render cannot currently be used together.
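The restriction above can be enforced at argument-parsing time. The following is a minimal sketch, not the actual parser in job_script.py: it assumes both flags take `True`/`False` string values (as `--record_video True` in the command above suggests), and the helper names `str2bool` and `parse_args` are hypothetical.

```python
import argparse


def str2bool(value):
    """Interpret 'True'/'False'-style strings as booleans (hypothetical helper)."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def parse_args(argv=None):
    """Parse the two flags and reject the unsupported combination."""
    parser = argparse.ArgumentParser(description="Illustrative flag check")
    # Assumption: both flags default to False when omitted.
    parser.add_argument("--record_video", type=str2bool, default=False)
    parser.add_argument("--render", type=str2bool, default=False)
    args = parser.parse_args(argv)
    if args.record_video and args.render:
        # parser.error prints a usage message and exits with status 2.
        parser.error("--record_video and --render cannot both be True")
    return args
```

Checking the parsed values, rather than using an argparse mutually exclusive group, still allows a call such as `--record_video True --render False`.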

This repository accompanies the DAPG project, presented at RSS 2018. Please see the project page for the paper and a video demonstration of the results.

Organization

The overall project is organized into three repositories:

  1. mjrl provides a suite of learning algorithms for various continuous control tasks simulated in MuJoCo. This includes the NPG implementation and the DAPG algorithm used in the paper.
  2. mj_envs provides a suite of continuous control tasks simulated in MuJoCo, including the dexterous hand manipulation tasks used in the paper.
  3. hand_dapg (this repository) serves as the landing page and contains the human demonstrations and pre-trained policies for the tasks.

This modular organization was chosen to allow rapid, independent development of algorithms and tasks, and to make it easier to share results with the broader research community.

Getting started

Each repository above contains detailed setup instructions.

  1. Install mjrl, using the instructions in the repository (direct link). mjrl ships with an Anaconda environment that makes it easy to import and use a variety of MuJoCo tasks.
  2. Install mj_envs by following the instructions in the repository. Note that mj_envs uses git submodules, so it must be cloned as described in that repository.
  3. After setting up mjrl and mj_envs, clone this repository and use the following commands to visualize the demonstrations and pre-trained policies.
$ cd dapg
$ python utils/visualize_demos.py --env_name relocate-v0
$ python utils/visualize_policy.py --env_name relocate-v0 --policy policies/relocate-v0.pickle

NOTE: If the visualization fails with a GLFW error, mujoco-py is most likely not detecting the graphics drivers correctly. This can usually be fixed by explicitly loading the correct drivers before running the Python script. See this page for details.
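One common way to load drivers explicitly on Linux is to set `LD_PRELOAD` for the process that runs the visualization script. The sketch below assumes that approach. The helper names `env_with_preload` and `launch` are hypothetical, and the library path in the usage comment is only an example that varies by system and driver version.

```python
import os
import subprocess


def env_with_preload(libraries, base_env=None):
    """Return a copy of the environment with `libraries` prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    parts = list(libraries) + ([existing] if existing else [])
    env["LD_PRELOAD"] = ":".join(parts)
    return env


def launch(cmd, libraries):
    """Run `cmd` in a child process with the given libraries preloaded."""
    return subprocess.run(cmd, env=env_with_preload(libraries), check=True)


# Example usage (assumed library path; locate the real one on your system):
# launch(
#     ["python", "utils/visualize_demos.py", "--env_name", "relocate-v0"],
#     ["/usr/lib/x86_64-linux-gnu/libGLEW.so"],
# )
```

Setting the variable only for the child process keeps the change local to the visualization run and leaves the rest of the shell session untouched.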

Bibliography

If you use the code in this repository or the associated repositories above, please cite the following paper.

@INPROCEEDINGS{Rajeswaran-RSS-18,
    AUTHOR    = {Aravind Rajeswaran AND Vikash Kumar AND Abhishek Gupta AND
                 Giulia Vezzani AND John Schulman AND Emanuel Todorov AND Sergey Levine},
    TITLE     = "{Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations}",
    BOOKTITLE = {Proceedings of Robotics: Science and Systems (RSS)},
    YEAR      = {2018},
}