# hand_vil

Learning Deep Visuomotor Policies for Dexterous Hand Manipulation

## Background

The overall project is built on top of these three repositories:

1. `mjrl` provides a suite of learning algorithms for various continuous control tasks simulated in MuJoCo. This includes the NPG implementation and the DAPG algorithm used in the paper.
2. `mj_envs` provides a suite of continuous control tasks simulated in MuJoCo, including the dexterous hand manipulation tasks used in the paper.
3. `hand_dapg` contains the human demonstrations and pre-trained expert policies for the tasks.

## Setup

Each repository above contains detailed setup instructions.

1. Install `mjrl` using the instructions in its repository. `mjrl` comes with an anaconda environment that makes it easy to import and use a variety of MuJoCo tasks.
2. Install `mj_envs` by following the instructions in its repository. Note that `mj_envs` uses git submodules and must therefore be cloned exactly as described there.
3. After setting up `mjrl` and `mj_envs`, add them to your Python path:

   ```
   $ export PYTHONPATH=$PYTHONPATH:<your_path>/mjrl
   $ export PYTHONPATH=$PYTHONPATH:<your_path>/mj_envs
   ```
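With both packages on the path, a quick smoke test along these lines should confirm the setup (a minimal sketch; `door-v0` is one of the dexterous hand task ids registered by `mj_envs`, and any of the four hand tasks should work):

```python
# smoke_test.py -- a minimal sketch to verify the installation.
# "door-v0" is an assumption here; swap in hammer-v0, pen-v0, or relocate-v0.
import mj_envs  # noqa: F401 -- importing registers the hand manipulation tasks
import gym

env = gym.make("door-v0")
print(env.observation_space, env.action_space)
```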

## Training the Visuomotor Policies

1. Create a `local_settings.py` file and set the variable `MAIN_DIR` to point to the root folder of the project. Consult `local_settings.py.sample` for reference (a minimal sketch is also shown after this list).

2. Clone this repo, then replace the environment XML files in `mj_envs` with those present in `mjrl_mod/envs/assets`, since we need to add the cameras used for training the visuomotor policies (a copy script is sketched after this list). Namely:

   - `mj_envs/hand_manipulation_suite/assets/DAPG_door.xml` is replaced by `mjrl_mod/envs/assets/DAPG_door.xml`
   - `mj_envs/hand_manipulation_suite/assets/DAPG_hammer.xml` is replaced by `mjrl_mod/envs/assets/DAPG_hammer.xml`
   - `mj_envs/hand_manipulation_suite/assets/DAPG_pen.xml` is replaced by `mjrl_mod/envs/assets/DAPG_pen.xml`
   - `mj_envs/hand_manipulation_suite/assets/DAPG_relocate.xml` is replaced by `mjrl_mod/envs/assets/DAPG_relocate.xml`
3. The expert policies for each environment, fetched from `hand_dapg`, are already included, so we are ready to train the visual policy for any of the four environments above.
   - It is highly recommended that you use a machine with a GPU for faster training. If you are not planning on using a GPU, make sure to set `use_cuda` in the config to `False`.
   - The training configs for the different environments are in `configs/`.
   - Move the config that you want to run to the root project directory. For example, to use the Hand Hammer config, run the following command:

     ```
     $ mv configs/config_main_hammer.py config_main.py
     ```

   - Now we are ready to train the visual policy:

     ```
     $ python run.py
     ```
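For step 1, the settings file might look like the following (a minimal sketch; the variable name `MAIN_DIR` comes from `local_settings.py.sample`, while the path itself is a placeholder):

```python
# local_settings.py -- a minimal sketch; adjust the path to your checkout.
MAIN_DIR = "/home/<your_user>/hand_vil"  # root folder of this project
```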
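For step 2, the four XML files can be copied in one go with a small script along these lines (a sketch, not part of the repo; it uses the relative paths listed above, so run it from a directory where both checkouts resolve, or adjust `SRC` and `DST`):

```python
# replace_xmls.py -- a hedged sketch for step 2; the relative paths are
# assumptions and should be adapted to where the repos live on your machine.
import shutil

TASKS = ["door", "hammer", "pen", "relocate"]
SRC = "mjrl_mod/envs/assets"                    # camera-augmented XMLs (this repo)
DST = "mj_envs/hand_manipulation_suite/assets"  # original mj_envs XMLs

for task in TASKS:
    shutil.copyfile(f"{SRC}/DAPG_{task}.xml", f"{DST}/DAPG_{task}.xml")
    print(f"replaced DAPG_{task}.xml")
```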

Note that this will save the generated training data to `gen_data/data/<name_of_run>/train_data`, the generated validation data to `gen_data/data/<name_of_run>/val_data`, and the trained policy to `gen_data/data/<name_of_run>/<abbr_run_name>_viz_policy`.

## Bibliography

If you use the code in this repository or the associated repositories above, please cite the following paper.

~~Citation to be added~~