diff --git a/index.html b/index.html
index 960a655..60aaa14 100644
--- a/index.html
+++ b/index.html
@@ -141,16 +141,6 @@
We propose EPH (Ensembling Prioritized Hybrid Policies), a Q-learning-based MARL-MAPF solver with communication. Our key contributions include:
EPH. EPH consists of two parts, as shown in the picture above. The upper part shows the neural network structure of EPH and how local partial observations are transformed into Q vectors. The lower part shows that, instead of selecting each action directly as \( a^t_i = \text{argmax} (q^t_i) \), several inference techniques, as mentioned in the Contributions, can be used to improve action quality and avoid collisions.
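For reference, the plain greedy rule \( a^t_i = \text{argmax} (q^t_i) \) that the inference techniques replace can be sketched as below. This is only a minimal illustration, not EPH's implementation: the action set `ACTIONS`, its ordering, and the helper names `greedy_action` and `greedy_joint_action` are assumptions made for the example.

```python
from typing import List, Sequence

# Hypothetical action set for a grid world; names and order are assumptions.
ACTIONS = ("stay", "up", "down", "left", "right")


def greedy_action(q_values: Sequence[float]) -> int:
    """Return the index of the largest Q value (the first index wins ties)."""
    if len(q_values) == 0:
        raise ValueError("q_values must not be empty")
    return max(range(len(q_values)), key=lambda a: q_values[a])


def greedy_joint_action(q_vectors: Sequence[Sequence[float]]) -> List[int]:
    """Pick each agent's action independently from its own Q vector."""
    return [greedy_action(q) for q in q_vectors]
```

Because each agent is handled independently here, nothing stops two agents from choosing moves that lead to the same cell, which is the kind of collision the inference techniques are meant to avoid.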
Training Method. The Q value for agent \(i\) is obtained via:
@@ -240,8 +232,8 @@ Hybrid Expert Guidance
@@ -268,8 +260,8 @@ @inproceedings{tang2024eph,
title={Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding},
@@ -350,18 +345,48 @@ BibTeX