Learning Deep Visuomotor Policies for Dexterous Hand Manipulation
The overall project is built on top of these three repositories:
- mjrl provides a suite of learning algorithms for various continuous control tasks simulated in MuJoCo. This includes the NPG implementation and the DAPG algorithm used in the paper.
- mj_envs provides a suite of continuous control tasks simulated in MuJoCo, including the dexterous hand manipulation tasks used in the paper.
- hand_dapg serves as the landing page for the DAPG project and contains the human demonstrations and pre-trained policies for the tasks.
Each repository above contains detailed setup instructions.
- Step 1: Install mjrl using the instructions in the repository (direct link). mjrl comes with an anaconda environment which helps to easily import and use a variety of MuJoCo tasks.
- Step 2: Install mj_envs by following the instructions in the repository. Note that mj_envs uses git submodules, and hence must be cloned correctly per the instructions in the repo.
- Step 3: After setting up mjrl and mj_envs, add them to your python path:
$ export PYTHONPATH=$PYTHONPATH:<your_path>/mjrl
$ export PYTHONPATH=$PYTHONPATH:<your_path>/mj_envs
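Once both paths are exported, a quick sanity check like the sketch below can confirm that the packages import and that an environment loads. The environment id hammer-v0 (along with door-v0, pen-v0, and relocate-v0) comes from mj_envs / hand_dapg; if your mj_envs version registers different ids, adjust accordingly.

```python
# check_setup.py -- minimal sketch to verify mjrl and mj_envs are on the python path
import mj_envs                      # registers the hand manipulation environments with gym
from mjrl.utils.gym_env import GymEnv

env = GymEnv('hammer-v0')           # 'door-v0', 'pen-v0', 'relocate-v0' should also work
obs = env.reset()
print("observation dim:", env.observation_dim, "action dim:", env.action_dim)
```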
- Step 1: Make a "local_settings.py" file and set the variable "MAIN_DIR" to point to the root folder of the project. Consult local_settings.py.sample; for example:
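A minimal local_settings.py could look like the following; the path shown is a placeholder, and local_settings.py.sample remains the authoritative template:

```python
# local_settings.py -- minimal sketch; see local_settings.py.sample for the full template
MAIN_DIR = "/path/to/your/project/root"   # placeholder: absolute path to the root folder of the project
```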
- Step 2: Clone this repo. Replace the environment XML files in mj_envs with those present in mjrl_mod/envs/assets, as we need to add the cameras for training the visuomotor policies (a copy sketch follows the list below). Namely:
  - mj_envs/hand_manipulation_suite/assets/DAPG_door.xml is to be replaced by mjrl_mod/envs/assets/DAPG_door.xml
  - mj_envs/hand_manipulation_suite/assets/DAPG_hammer.xml is to be replaced by mjrl_mod/envs/assets/DAPG_hammer.xml
  - mj_envs/hand_manipulation_suite/assets/DAPG_pen.xml is to be replaced by mjrl_mod/envs/assets/DAPG_pen.xml
  - mj_envs/hand_manipulation_suite/assets/DAPG_relocate.xml is to be replaced by mjrl_mod/envs/assets/DAPG_relocate.xml
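One way to perform the replacement is a short Python loop. This is a minimal sketch that assumes both checkouts sit under the current working directory; adjust the two directory variables to your layout:

```python
# replace_xmls.py -- sketch only; adjust the paths to where mj_envs and mjrl_mod live on your machine
import shutil

MJ_ENVS_ASSETS = "mj_envs/hand_manipulation_suite/assets"   # placeholder path
MJRL_MOD_ASSETS = "mjrl_mod/envs/assets"                    # placeholder path

for name in ("door", "hammer", "pen", "relocate"):
    src = f"{MJRL_MOD_ASSETS}/DAPG_{name}.xml"
    dst = f"{MJ_ENVS_ASSETS}/DAPG_{name}.xml"
    shutil.copyfile(src, dst)   # overwrite the stock XML with the camera-augmented version
    print(f"replaced {dst} with {src}")
```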
- Step 3: The expert policies for each environment, fetched from hand_dapg, are already included, so we are ready to train the visuomotor policy for any of the above four environments.
- It is highly recommended that you use a machine with a GPU for faster training. If you are not planning on using a GPU, make sure to set use_cuda in the config to False (see the config sketch after the mv command below).
- All the training configs for the different environments are present in configs/.
- Move the config that you want to run to the root project directory. For example, to use the Hand Hammer config, run the following command:
mv configs/config_main_hammer.py config_main.py
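After moving it, edit config_main.py to match your hardware before launching training. Only the use_cuda flag is named in this README; whether it is a plain module-level variable as sketched below is an assumption, so check the actual layout of the config file:

```python
# config_main.py (hypothetical excerpt) -- only use_cuda is documented in this README;
# the exact structure of the config file may differ.
use_cuda = False    # set to True only if a CUDA-capable GPU is available
```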
- Now we are ready to train the visual model:
$ python run.py
Note that this will save the generated training data to gen_data/data/<name_of_run>/train_data, the generated validation data to gen_data/data/<name_of_run>/val_data, and the trained policy to gen_data/data/<name_of_run>/<abbr_run_name>_viz_policy.
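To inspect the trained policy afterwards, something like the sketch below may work. It assumes the policy is serialized with pickle (as the pre-trained policies in hand_dapg are) and keeps the placeholder run names from above; verify how run.py actually saves the policy before relying on this.

```python
# load_policy.py -- hedged sketch; assumes pickle serialization, placeholder names kept as-is
import pickle

policy_file = "gen_data/data/<name_of_run>/<abbr_run_name>_viz_policy"  # placeholders from above

with open(policy_file, "rb") as f:
    viz_policy = pickle.load(f)     # will fail if run.py uses a different serialization format
print(type(viz_policy))
```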
If you use the code in this or associated repositories above, please cite the following paper.
~~Put Citation Here~~