Multi-Domain and Multi-Task Deep Reinforcement Learning for Continuous Control - Uses a deep neural network with hard parameter sharing (a "multi-headed network") as the policy and value function approximator, so that a single reinforcement learning agent can learn multiple tasks and domains in parallel.
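To make the idea of hard parameter sharing concrete, here is a minimal sketch of a multi-headed policy in plain Python. It is not the code from this repository: the class name `MultiHeadPolicy`, the task names, the layer sizes, and the assumption that all tasks share one observation size are illustrative only. One shared hidden layer feeds a separate output head per task.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create a small dense layer as (weights, biases) with uniform init."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(layer, x):
    """Apply a dense layer to the input vector x."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


class MultiHeadPolicy:
    """Hypothetical multi-headed policy: shared trunk, one head per task.

    Assumption: every task uses the same observation size (obs_dim).
    """

    def __init__(self, obs_dim, hidden_dim, action_dims, seed=0):
        rng = random.Random(seed)
        # Shared layer: its parameters are used by every task.
        self.trunk = init_layer(obs_dim, hidden_dim, rng)
        # Task-specific heads: one output layer per task name.
        self.heads = {task: init_layer(hidden_dim, n_actions, rng)
                      for task, n_actions in action_dims.items()}

    def features(self, obs):
        """Shared representation computed from the observation."""
        return [math.tanh(h) for h in linear(self.trunk, obs)]

    def action_mean(self, task, obs):
        """Mean action for the given task, read from that task's head."""
        return linear(self.heads[task], self.features(obs))


# Usage with two made-up tasks that have different action sizes.
policy = MultiHeadPolicy(obs_dim=4, hidden_dim=8,
                         action_dims={"task_a": 2, "task_b": 3})
obs = [0.1, -0.2, 0.3, 0.0]
mean_a = policy.action_mean("task_a", obs)
mean_b = policy.action_mean("task_b", obs)
```

Both heads read the same `features(obs)`, so the trunk is shared across tasks while each head stays task-specific. A value function could get heads in the same way.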
The agent is trained with Proximal Policy Optimization (PPO) combined with the KL-divergence constraint from TRPO. Environments are taken from OpenAI Gym.
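As a rough illustration of the training objective, the sketch below computes the standard PPO clipped surrogate and a sample-based KL estimate. How this project actually applies the KL constraint is described in the dissertation; here it is assumed, purely for illustration, that policy updates stop once the estimated KL exceeds a target. The function names and the defaults `clip_eps=0.2` and `kl_target=0.01` are hypothetical.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate over a batch of samples (to be maximized)."""
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def approx_kl(new_logp, old_logp):
    """Sample estimate of KL(old || new) from actions drawn under the old policy."""
    return sum(lp_old - lp_new
               for lp_new, lp_old in zip(new_logp, old_logp)) / len(old_logp)


def should_stop_update(new_logp, old_logp, kl_target=0.01):
    """Assumed rule: stop updating once the policy has moved too far."""
    return approx_kl(new_logp, old_logp) > kl_target


# Usage on a tiny hand-made batch of log-probabilities and advantages.
old_logp = [-1.0, -0.5, -2.0]
new_logp = [-0.9, -0.6, -1.5]
advantages = [1.0, -0.5, 2.0]
objective = ppo_clipped_objective(new_logp, old_logp, advantages)
stop = should_stop_update(new_logp, old_logp)
```

In a multi-task setting, one plausible choice is to compute this objective per task head and combine the results; the sketch does not claim that this is how the project does it.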
- Dissertation document describing the project: MTRL_disseration.pdf
- Video showing single and multi domain/task results: Summary Video
David C. Alvarez-Charris, Chenyang Zhao, Timothy Hospedales. University of Edinburgh, 2018. Dissertation for MSc in Artificial Intelligence.