WilliamYue37/t-DGR

t-DGR

t-DGR: A Trajectory-Based Deep Generative Replay Method for Continual Learning in Decision Making

Installation

Download MuJoCo 2.1.0 (mujoco210), then run the following commands:

conda env create -f environment.yml
conda activate t-dgr
pip install -r requirements.txt
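
Before creating the environment, it can help to confirm that the MuJoCo binaries are where the Python bindings will look for them. The sketch below assumes the conventional mujoco-py location, ~/.mujoco/mujoco210; this README does not state the expected path, so treat both the location and the helper name find_mujoco210 as assumptions.

```python
from pathlib import Path


def find_mujoco210(root=None):
    """Return the mujoco210 directory if it exists, else None.

    `root` defaults to ~/.mujoco, the location mujoco-py conventionally
    searches. That default is an assumption, not something this README states.
    """
    base = Path(root) if root is not None else Path.home() / ".mujoco"
    candidate = base / "mujoco210"
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    location = find_mujoco210()
    if location is None:
        print("mujoco210 not found under ~/.mujoco; extract the download there first.")
    else:
        print(f"Found MuJoCo at {location}")
```

If the binaries live elsewhere, pass that directory as root instead of relying on the default.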

Getting Started

See the scripts/ folder for examples of how to train and evaluate methods. For example, to train t-DGR, run the following command:

./scripts/run_t-dgr.sh

Alternatively, try t-DGR in our Google Colab notebook.

Training

To train a model, run the following command:

python methods/<method_name>/train_<method_name>.py [--options]

To see the full list of options, run python methods/<method_name>/train_<method_name>.py --help.

At the end of each task, a learner model checkpoint is saved to run/<ckpt-folder>/learner_ckpts/.
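
After a run finishes, the saved checkpoints can be enumerated with a few lines of Python. This is a minimal sketch: the helper name list_learner_checkpoints is hypothetical, and because the README does not document the checkpoint file names, the code lists every file in the folder and orders them by modification time rather than parsing task indices.

```python
from pathlib import Path


def list_learner_checkpoints(ckpt_folder, run_root="run"):
    """List files in <run_root>/<ckpt_folder>/learner_ckpts/, oldest first.

    Checkpoint file names are not documented in the README, so files are
    ordered by modification time, with the file name as a tie-breaker.
    """
    ckpt_dir = Path(run_root) / ckpt_folder / "learner_ckpts"
    if not ckpt_dir.is_dir():
        return []
    files = [p for p in ckpt_dir.iterdir() if p.is_file()]
    return sorted(files, key=lambda p: (p.stat().st_mtime, p.name))
```

For example, list_learner_checkpoints("my_run") returns the checkpoint paths for a run stored under run/my_run/, where my_run stands in for whatever checkpoint folder was chosen at training time.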

Evaluation

To evaluate a model, run the following command:

python methods/<method_name>/test.py [--options]

To see the full list of options, run python methods/<method_name>/test.py --help.
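
The training and evaluation entry points follow a fixed path pattern, so launching several methods from one script reduces to building the right command line. The sketch below is an illustration only: build_command and the mode names are hypothetical, no method-specific flags are assumed, and any options must come from the --help output of the script in question.

```python
import sys


def build_command(method_name, mode="train", extra_args=()):
    """Build the argument list for a method's train or test script.

    Paths follow the patterns given in this README:
      train -> methods/<method_name>/train_<method_name>.py
      test  -> methods/<method_name>/test.py
    """
    if mode == "train":
        script = f"methods/{method_name}/train_{method_name}.py"
    elif mode == "test":
        script = f"methods/{method_name}/test.py"
    else:
        raise ValueError(f"mode must be 'train' or 'test', got {mode!r}")
    # sys.executable keeps the call inside the active conda environment.
    return [sys.executable, script, *extra_args]
```

The resulting list can be passed to subprocess.run(..., check=True) from the repository root, for example with extra_args=["--help"] to print a script's options.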

Datasets

The Continual World and GCL10 datasets used in the paper are located in datasets/continual_world/ and datasets/GCL10/, respectively. The script used to generate expert demonstrations is included in datasets/collect_data.py. To collect expert demonstrations, run the following command:

python datasets/collect_data.py [--options]

To see the full list of options, run python datasets/collect_data.py --help.

Citation

If you find t-DGR useful in your research, please consider citing our paper:

@misc{yue2024tdgr,
    title={t-DGR: A Trajectory-Based Deep Generative Replay Method for Continual Learning in Decision Making}, 
    author={William Yue and Bo Liu and Peter Stone},
    year={2024},
    eprint={2401.02576},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Acknowledgements

Our diffusion model is based on Phil Wang's denoising-diffusion-pytorch repository. Our 1-D U-Net model is based on Michael Janner's diffuser repository.