UPESI

Code for the paper Not Only Domain Randomization: Universal Policy with Embedding System Identification.

Installation

This repo uses the robolite environment, a modified version of robosuite that supports domain randomisation and inverse kinematics (IK). The same modified environment is also used in another project.

If you are installing this repo for the first time, make sure that Anaconda is installed and that IsaacGym_Preview_3_Package.tar.gz, downloaded from the official NVIDIA website, is in this folder (upesi). Then run ./initialization.sh with bash, without superuser privilege.

You'll then get a conda environment named rlgpu.

The isaacgymenvs and robolite directories will be created as siblings of this directory.

You should always use bash to run commands.
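The prerequisites listed above can be checked before running the initialization script. The following is a minimal Python sketch that tests only the three stated conditions: conda is on the PATH, the Isaac Gym archive is in the repo folder, and the current user is not root. The helper name check_prerequisites is hypothetical and is not part of this repo.

```python
import os
import shutil
from pathlib import Path

ISAAC_GYM_ARCHIVE = "IsaacGym_Preview_3_Package.tar.gz"


def check_prerequisites(repo_dir="."):
    """Return a list of problems found; an empty list means nothing was flagged.

    Hypothetical helper for illustration, not part of the repo.
    """
    problems = []
    # The setup creates a conda environment, so conda must be on PATH.
    if shutil.which("conda") is None:
        problems.append("conda not found on PATH")
    # The Isaac Gym archive is expected inside the repo folder.
    if not (Path(repo_dir) / ISAAC_GYM_ARCHIVE).is_file():
        problems.append(f"{ISAAC_GYM_ARCHIVE} not found in {repo_dir}")
    # The script should run without superuser privilege (POSIX-only check).
    if hasattr(os, "geteuid") and os.geteuid() == 0:
        problems.append("running as root; run as a normal user instead")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

If no warnings are printed when this is run from the upesi folder, none of the three conditions was violated. It does not verify anything else that initialization.sh may need.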

Citation:

Please cite our paper if you make use of this repo:

@article{ding2021not,
  title={Not Only Domain Randomization: Universal Policy with Embedding System Identification},
  author={Ding, Zihan},
  journal={arXiv preprint arXiv:2109.13438},
  year={2021}
}

Training Procedure

For the universal policy (UP) with embedding system identification (ESI), we use the following commands.

First, a pretrained model is needed for each environment to roll out samples for later use (learning the dynamics prediction in our method):

Get pretrained model

Remember to suspend parameter randomization (set randomized_params=None in ./default_params.py) for getting this policy.

python train.py basic.env_name=inverteddoublependulum

The command above is an example for the InvertedDoublePendulum environment, using the TD3 algorithm for training. After training, the weights are saved in the data folder. Replace the model path in the later scripts with the path to the weights you obtained.
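The exact file names written by training are not listed here, so one way to find the path to substitute is to look up the most recently modified file under the data folder. This is a minimal sketch, assuming the weights are regular files somewhere under ./data; latest_file is a hypothetical helper and is not part of this repo.

```python
from pathlib import Path


def latest_file(root="data"):
    """Return the most recently modified regular file under root, or None.

    Hypothetical helper for illustration, not part of the repo.
    """
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    if not files:
        return None
    # Pick the file with the newest modification time.
    return max(files, key=lambda p: p.stat().st_mtime)
```

Calling latest_file("data") from the repo root right after training returns a pathlib.Path (or None if the folder has no files). If other files are written to the data folder later, check that the returned path really is the weights file.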

Go to the directory:

 cd dynamics_predict

Collect training and testing dataset

python train_dynamics.py --collect_train_data --env ENV_NAME
python train_dynamics.py --collect_test_data --env ENV_NAME

Normalize data

Run

 cd ../data/dynamics_data
 jupyter notebook

and open data_process_*ENV_NAME*.ipynb and go through each cell.
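The notebook's exact steps are not reproduced here. As a rough illustration of what normalizing such a dataset usually involves, the sketch below standardizes each feature to zero mean and unit variance with NumPy, using statistics computed on the training set only. The array shapes and function names are assumptions, and the notebook may use a different scheme (for example min-max scaling).

```python
import numpy as np


def fit_normalizer(train, eps=1e-8):
    """Compute per-feature mean and std from training data of shape (N, D)."""
    mean = train.mean(axis=0)
    std = train.std(axis=0) + eps  # eps avoids division by zero for constant features
    return mean, std


def normalize(data, mean, std):
    """Standardize data with previously fitted statistics."""
    return (data - mean) / std


def denormalize(data, mean, std):
    """Invert normalize() to recover values in the original units."""
    return data * std + mean


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for the collected training and testing datasets.
    train = rng.normal(loc=5.0, scale=2.0, size=(1000, 3))
    test = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
    mean, std = fit_normalizer(train)
    train_n = normalize(train, mean, std)
    test_n = normalize(test, mean, std)  # reuse the training statistics
    print(train_n.mean(axis=0), train_n.std(axis=0))
```

Reusing the training statistics for the test set keeps both datasets on the same scale, which matters when a model trained on one is evaluated on the other.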

Train dynamics embedding (encoder, decoder and dynamics prediction model)

Go back to the terminal in dynamics_predict/.

Run the following to launch training:

python train_dynamics.py --train_embedding --env ENV_NAME

and launch tensorboard --logdir runs to monitor the training process.

Test learned encoder and dynamics predictor

Test the performance of the learned encoder and dynamics predictor by applying them in ESI on the collected test data:

jupyter notebook

and open test_dynamics_*ENV_NAME*.ipynb and go through each cell, including a Bayesian optimization (BO) process.
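To give a feel for the search step, the sketch below assumes that ESI looks for the embedding under which a dynamics predictor best explains observed transitions. That reading, the toy predictor, and all names here are assumptions for illustration and are not the repo's implementation. Plain random search is used as a simple stand-in for Bayesian optimization, so this is not BO itself.

```python
import numpy as np


def toy_predictor(states, actions, z):
    """Stand-in for a learned dynamics predictor: next_state = s + z * a."""
    return states + z * actions


def prediction_error(z, states, actions, next_states):
    """Mean squared error between predicted and observed next states."""
    pred = toy_predictor(states, actions, z)
    return float(np.mean((pred - next_states) ** 2))


def identify_embedding(states, actions, next_states, low, high,
                       n_trials=500, seed=0):
    """Random search over a scalar embedding (a simple stand-in for BO)."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(low, high, size=n_trials)
    errors = [prediction_error(z, states, actions, next_states)
              for z in candidates]
    best = int(np.argmin(errors))  # candidate with the lowest prediction error
    return float(candidates[best]), errors[best]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    true_z = 0.7  # hidden from the search; used only to generate transitions
    s = rng.normal(size=(200, 4))
    a = rng.normal(size=(200, 4))
    s_next = toy_predictor(s, a, true_z)
    z_hat, err = identify_embedding(s, a, s_next, low=0.0, high=2.0)
    print(f"estimated z = {z_hat:.3f}, error = {err:.2e}")
```

A BO procedure would replace the uniform sampling with a surrogate model that proposes the next candidate, which typically needs far fewer evaluations when each one is expensive.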

Train UP

cd ..
python train.py --train --env *ENV_NAME*dynamics --process NUM

Select the encoder-decoder type in ./environment/*ENV_NAME*dynamics.py to match the one used in ./dynamics_predict/train_dynamics.py.

Test ESI with UP against other methods

cd dynamics_predict
python compare_methods_*ENV_NAME*.py

