wenshuaizhao/optimappo

Optimistic Multi-Agent Policy Gradient

This is the code for optimappo (paper, website), which enables optimism in multi-agent policy gradient methods by shaping the advantage estimation. It is a simple but effective way to improve MAPPO on deterministic tasks by overcoming the relative overgeneralization problem.
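This README does not spell out the shaping rule, so the following is only a minimal sketch of what "shaping the advantage estimation" could look like. It assumes a leaky-ReLU-style rule that keeps positive advantages and scales negative ones down. The function name `optimistic_advantage` and the parameter `eta` are hypothetical and are not taken from this repository.

```python
def optimistic_advantage(advantages, eta=0.0):
    """Reshape advantages optimistically (illustrative assumption, not the repo's API).

    Positive advantages are kept unchanged; negative advantages are scaled
    by eta in [0, 1]. With eta=1.0 the input is returned unchanged (no
    optimism); with eta=0.0 negative advantages are zeroed out.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must be in [0, 1]")
    return [a if a >= 0.0 else eta * a for a in advantages]


if __name__ == "__main__":
    raw = [1.0, -2.0, 0.5]
    # Negative entries shrink toward zero as eta decreases.
    for eta in (1.0, 0.5, 0.0):
        print(eta, optimistic_advantage(raw, eta=eta))
```

Under this assumption, the reshaped advantages would stand in for the raw ones in the usual PPO surrogate loss, so low-return joint actions caused by other agents' exploration penalize an action less.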

Installation

Train your optimistic MAPPO (optimappo)

cd scripts
./train_mujoco_local.sh

Expected results

Performance on MaMuJoCo

Citation

If you find this code useful for your work, please cite our paper:

@inproceedings{zhao2024optimistic,
  title={Optimistic Multi-Agent Policy Gradient},
  author={Zhao, Wenshuai and Zhao, Yi and Li, Zhiyuan and Kannala, Juho and Pajarinen, Joni},
  booktitle={Proceedings of the International Conference on Machine Learning},
  year={2024}
}
