This is a collection of recent MARL papers with their code and talks (if available). Some highly relevant papers (single-agent RL, multi-agent perception) may also be included.
To include your paper, please open a new issue with the paper's information (citation, website, codebase, etc.). Thank you.
Maintainer: Yaru Niu
Email: [email protected]
- ALMA: Hierarchical Learning for Composite Multi-Agent Tasks. NeurIPS 2022. [paper]
- Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning. NeurIPS 2020. [paper][talk][website]
- RODE: Learning Roles to Decompose Multi-Agent Tasks. ICLR 2021. [paper][code][talk]
- OPtions as REsponses: Grounding Behavioural Hierarchies in Multi-Agent Reinforcement Learning. ICML 2020. [paper][talk]
- MAVEN: Multi-Agent Variational Exploration. NeurIPS 2019. [paper][code]
- Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery. AAMAS 2020. [paper][code]
- Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition. ICML 2021. [paper][code][talk]
- Feudal Multi-Agent Deep Reinforcement Learning for Traffic Signal Control. AAMAS 2020. [paper][relevant code]
- Feudal Multi-Agent Hierarchies for Cooperative Reinforcement Learning. arXiv 2019. [paper][talk]
- [not MARL] FeUdal Networks for Hierarchical Reinforcement Learning. ICML 2017. [paper][talk]
- Option-Critic in Cooperative Multi-Agent Systems. arXiv 2020 (abridged as an extended abstract in AAMAS 2020). [paper][extended abstract]
- Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction. arXiv 2019. [paper]
- [not MARL] Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation. NIPS 2016. [paper][code][talk]
- Credit Assignment with Meta-Policy Gradient for Multi-Agent Reinforcement Learning. arXiv 2021. [paper]
- Meta-CPR: Generalize to Unseen Large Number of Agents with Communication Pattern Recognition Module. arXiv 2022. [paper]
- RODE: Learning Roles to Decompose Multi-Agent Tasks. ICLR 2021. [paper][code][talk]
- ROMA: Multi-Agent Reinforcement Learning with Emergent Roles. ICML 2020. [paper][code][talk][website]
- Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts. IJCAI 2021. [paper][code]
- Multi-Agent Graph-Attention Communication and Teaming. AAMAS 2021 (Best Paper Award at MAIR2@ICCV'21). [paper][paper (MAIR2)][code][talk][talk (MAIR2)]
- Learning Correlated Communication Topology in Multi-Agent Reinforcement Learning. AAMAS 2021. [paper][talk]
- TarMAC: Targeted Multi-Agent Communication. ICML 2019. [paper][talk]
- Learning to Schedule Communication in Multi-agent Reinforcement Learning. ICLR 2019. [paper][code]
- Learning Multiagent Communication with Backpropagation. NIPS 2016. [paper]
- Learning Efficient Multi-agent Communication: An Information Bottleneck Approach. ICML 2020. [paper][code][talk]
- Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-Time Multi-Robot Cooperative Exploration. AAMAS 2023. [paper]
- Multi-Agent Incentive Communication via Decentralized Teammate Modeling. AAAI 2022. [paper][code]
- Self-Organized Group for Cooperative Multi-agent Reinforcement Learning. NeurIPS 2022. [paper][supplementary]
- Complementary Attention for Multi-Agent Reinforcement Learning. ICML 2023. [paper][code]
- Towards True Lossless Sparse Communication in Multi-Agent Systems. ICRA 2023. [paper][talk]
- Learning Multi-Agent Communication from Graph Modeling Perspective. ICLR 2024. [paper][code]
- Order Matters: Agent-by-agent Policy Optimization. ICLR 2023. [paper][code][中文blog][talk]
- Revisiting Some Common Practices in Cooperative Multi-Agent Reinforcement Learning. ICML 2022. [paper][website]
- Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition. ICML 2021. [paper][code][talk]
- Multi-Agent Collaboration via Reward Attribution Decomposition. arXiv 2020. [paper][code][website]
- Online Ad Hoc Teamwork under Partial Observability. ICLR 2022. [paper]
- Open Ad Hoc Teamwork using Graph-based Policy Learning. ICML 2021. [paper][code][talk]
- NerveNet: Learning Structured Policy with Graph Neural Networks. ICLR 2018. [paper][code][website][video]
- One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control. ICML 2020. [paper][code][talk][website]
- Snowflake: Scaling GNNs to High-Dimensional Continuous Control via Parameter Freezing. NeurIPS 2021. [paper][talk]
- Multi-Agent Reinforcement Learning is a Sequence Modeling Problem (Multi-Agent Transformer). NeurIPS 2022. [paper][code][website]
- UPDeT: Universal Multi-Agent Reinforcement Learning via Policy Decoupling with Transformers. ICLR 2021. [paper][code]
- Meta-CPR: Generalize to Unseen Large Number of Agents with Communication Pattern Recognition Module. arXiv 2022. [paper]
- Adaptable and Scalable Multi-Agent Graph-Attention Communication. Georgia Tech MS Thesis. [paper]
- Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis. arXiv 2022. [paper]
- MAVIPER: Learning Decision Tree Policies for Interpretable Multi-agent Reinforcement Learning. ECML-PKDD 2022. [paper]
- Multi-Agent Manipulation via Locomotion using Hierarchical Sim2Real. CoRL 2018. [paper][website]
- A Framework for Real-World Multi-Robot Systems Running Decentralized GNN-Based Policies. ICRA 2022. [paper][code 1][code 2][video]
- Reinforcement Learned Distributed Multi-Robot Navigation with Reciprocal Velocity Obstacle Shaped Rewards. RA-L 2022. [paper][code][video]
- Solving Multi-Entity Robotic Problems Using Permutation Invariant Neural Networks. arXiv 2024. [paper]
- MQE: Unleashing the Power of Interaction with Multi-agent Quadruped Environment. arXiv 2024. [paper][code][website]
- COMPOSER: Scalable and Robust Modular Policies for Snake Robots. ICRA 2024. [paper][website]
- Learning Decentralized Multi-Biped Control for Payload Transport. CoRL 2024. [paper][website][code]
- Learning Multi-Agent Loco-Manipulation for Long-Horizon Quadrupedal Pushing. arXiv 2024. [paper][website][code]
- Multi-Agent Constrained Policy Optimisation. arXiv 2022. [paper][code]
- Safe Multi-Agent Isaac Gym Benchmark (Safe MAIG). [code]
- DiscoNet: Learning Distilled Collaboration Graph for Multi-Agent Perception. NeurIPS 2021. [paper][code]
- Coopernaut: End-to-End Driving with Cooperative Perception for Networked Vehicles. CVPR 2022. [paper][code][website]
- V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction. ECCV 2020. [paper][code]
- OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication. ICRA 2022. [paper][code][website]
- When2com: Multi-Agent Perception via Communication Graph Grouping. CVPR 2020. [paper][code][website]
- Who2com: Collaborative Perception via Learnable Handshake Communication. ICRA 2020. [paper][website]
- V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer. ECCV 2022. [paper][code]
- V2X-Sim: Multi-Agent Collaborative Perception Dataset and Benchmark for Autonomous Driving. RA-L 2022. [paper][code][website]
- Collaborative Multi-Object Tracking With Conformal Uncertainty Propagation. RA-L 2024. [paper][website]
- An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective. arXiv 2021. [paper]
- Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms. arXiv 2021. [paper]
- Multi-Agent Reinforcement Learning: Methods, Applications, Visionary Prospects, and Challenges. arXiv 2023. [paper]
- ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination. NeurIPS 2024 (Track on Datasets and Benchmarks). [paper][code]
- JaxMARL: Multi-Agent RL Environments in JAX. arXiv 2023. [paper][code][blog]
- Python MARL (PyMARL) framework. [code]
- Rethinking the Implementation Tricks and Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning (PyMARL2). arXiv 2023. [paper][code]
- Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks (EPyMARL). NeurIPS 2021 (Track on Datasets and Benchmarks). [paper][code][talk][blog]
- The StarCraft Multi-Agent Challenge. AAMAS 2019. [paper][code]
- SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning. NeurIPS 2023 (Track on Datasets and Benchmarks). [paper][code]
- FACMAC: Factored Multi-Agent Centralised Policy Gradients (introducing multi-agent MuJoCo). arXiv 2020. [paper][code]
- TransfQMix: Transformers for Leveraging the Graph Structure of Multi-Agent Reinforcement Learning Problems (PyMARL Transformers). AAMAS 2023. [paper][code]
- The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games. NeurIPS 2022 (Track on Datasets and Benchmarks). [paper][code (on-policy)][code (off-policy)][talk]
- Trust Region Policy Optimisation in Multiagent Reinforcement Learning. ICLR 2022. [paper][code][talk]
- MARLlib: Extending RLlib for Multi-agent Reinforcement Learning. arXiv 2022. [paper][code][website]
- Multi-Agent Constrained Policy Optimisation. arXiv 2022. [paper][code]
- MATE: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control. NeurIPS 2022 (Track on Datasets and Benchmarks). [paper][code][webpage]
- Safe Multi-Agent Isaac Gym Benchmark (Safe MAIG). [code]
- Settling the Variance of Multi-Agent Policy Gradients. NeurIPS 2021. [paper][code]
- Towards Human-Level Bimanual Dexterous Manipulation with Reinforcement Learning. NeurIPS 2022. [paper][code]
- Heterogeneous-Agent Reinforcement Learning. [code]
- MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning. JMLR 2023. [paper][code]
- Nocturne: a scalable driving benchmark for bringing multi-agent learning one step closer to the real world. NeurIPS 2022 (Track on Datasets and Benchmarks). [paper][code][website][talk]
- ScenarioNet: Open-Source Platform for Large-Scale Traffic Scenario Simulation and Modeling. NeurIPS 2023. [paper][code][website]
- MQE: Unleashing the Power of Interaction with Multi-agent Quadruped Environment. arXiv 2024. [paper][code][website]