awesome-rl-papers

Block MDP

  • Provable RL with Exogenous Distractors via Multistep Inverse Dynamics (ICLR2022 oral) arxiv [no code]
  • Provably efficient RL with Rich Observations via Latent State Decoding (ICML2019) arxiv code
  • Provable Rich Observation Reinforcement Learning with Combinatorial Latent States (ICLR2021) arxiv [no code]
  • Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning (ICML2020) arxiv [no code]
  • Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach (arxiv) arxiv code
  • Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning arxiv [no code]
  • On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP (ICML2021) arxiv pdf
  • Block Contextual MDPs for Continual Learning (ICLR2022 withdrawn) openreview
  • Learning Domain Invariant Representations in Goal-conditioned Block MDPs arxiv code

Lifelong Learning, Continual Learning

  • Modular Lifelong Reinforcement Learning via Neural Composition (ICLR2022) arxiv [no code]
  • Generalisation in Lifelong Reinforcement Learning through Logical Composition (ICLR2022) arxiv [code bug]
  • Continual Learning via Local Module Composition (NIPS2021) arxiv code
  • Gradient Projection Memory for Continual Learning (ICLR2021 oral) arxiv code
  • Policy and value transfer in lifelong reinforcement learning (ICML2018) arxiv [no code]
  • Lipschitz Lifelong Reinforcement Learning (AAAI2021) arxiv code
  • Towards Continual Reinforcement Learning: A Review and Perspectives arxiv
  • Fast reinforcement learning with generalized policy updates (PNAS) arxiv
  • Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement (ICML2018) arxiv [no code]
  • Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting (NIPS2020) arxiv
  • Policy Consolidation for Continual Reinforcement Learning (ICML2019) arxiv code
  • Continual Reinforcement Learning with Complex Synapses arxiv [no code]
  • Continuous Coordination As a Realistic Scenario for Lifelong Learning arxiv code1 code2
  • Lifelong Incremental Reinforcement Learning with Online Bayesian Inference (TNNLS) pdf code
  • Is Model-Free Learning Nearly Optimal for Non-Stationary RL? (ICML2021) arxiv [no code]

Generalization

  • Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL (ICLR2022) arxiv code
  • Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability (NIPS2021) arxiv
  • Constructing a Good Behavior Basis for Transfer using Generalized Policy Updates (ICLR2022) arxiv [no code]
  • Environment Generation for Zero-Shot Compositional Reinforcement Learning (NIPS2021) arxiv code?
  • Reinforcement Learning with Prototypical Representations (ICML2021) arxiv code
  • Deep Reinforcement Learning amidst Continual Structured Non-Stationarity (ICML2021) arxiv
  • K-level Reasoning for Zero-Shot Coordination in Hanabi (NIPS2021) arxiv
  • Source tasks selection for transfer deep reinforcement learning: a case of study on Atari games
  • The Distracting Control Suite -- A Challenging Benchmark for Reinforcement Learning from Pixels pdf code
  • AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning (ICLR2022 spotlight) arxiv code
  • Improving zero-shot generalization in offline reinforcement learning using generalized similarity functions (ICLR2022 rejected) openreview code
  • DARLA: Improving Zero-Shot Transfer in Reinforcement Learning (ICML2017) arxiv code
  • Case-based reasoning for better generalization in textual reinforcement learning (ICLR2022 poster) arxiv [no code]
  • Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks (ICML2021) arxiv code
  • Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning (ICML2021) arxiv code
  • On the Generalization of Representations in Reinforcement Learning (AISTATS2022) arxiv code
  • Policy Architectures for Compositional Generalization in Control arxiv code
  • Leveraging procedural generation to benchmark reinforcement learning
  • Quantifying generalization in reinforcement learning
  • Decoupling value and policy for generalization in reinforcement learning code
  • On overfitting and asymptotic bias in batch reinforcement learning with partial observability
  • Improving generalization in reinforcement learning with mixture regularization (NIPS2020) code
  • Observational overfitting in reinforcement learning
  • Assessing generalization in deep reinforcement learning
  • Neuro-algorithmic Policies enable Fast Combinatorial Generalization (ICML2021) [no code]
  • Self-supervised Visual Reinforcement Learning with Object-centric Representations (ICLR2021 spotlight) code
  • Transient Non-stationarity and Generalisation in Deep Reinforcement Learning (ICLR2021)
  • Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals (NIPS2020) (GNN)
  • SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies (ICML2021) code
  • Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion (AAAI2021) code
  • Planning to Explore via Self-Supervised World Models (ICML2020) arxiv code

Transfer learning

  • Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers (ICLR2021) arxiv code
  • REPAINT: Knowledge Transfer in Deep Reinforcement Learning (ICML2021)

Multi-Task

  • Multi-Task Reinforcement Learning with Context-based Representations (ICML2021) arxiv code

Abstraction, Logical

  • Compositional Reinforcement Learning from Logical Specifications (NIPS2021) arxiv code
  • Learning Markov State Abstractions for Deep Reinforcement Learning (NIPS2021) arxiv code
  • R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning (ICLR2022) arxiv
  • A Theory of Abstraction in Reinforcement Learning pdf thesis
  • Model-Invariant State Abstractions for Model-Based Reinforcement Learning arxiv [no code]

Symbolic

  • Emergent Symbols through Binding in External Memory (ICLR2021) arxiv code
  • Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients (ICLR2021) arxiv code
  • Discovering symbolic policies with deep reinforcement learning (ICML2021) arxiv
  • Iterated learning for emergent systematicity in VQA (ICLR2021 oral) arxiv

Auto RL

  • Evolving Reinforcement Learning Algorithms (ICLR2021 oral) arxiv
  • Discovering Reinforcement Learning Algorithms (NIPS2020) arxiv
  • CARL: A Benchmark for Contextual and Adaptive Reinforcement Learning (NIPS2021w) arxiv

Evolutionary RL

  • Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design (ICLR2022 oral) arxiv code

Graph

  • Graph Convolutional Reinforcement Learning (ICLR2020) arxiv pytorch tf
  • Graph Policy Gradients for Large Scale Robot Control (CoRL2019 oral) arxiv code
  • Actor-Attention-Critic for Multi-Agent Reinforcement Learning (ICML2019) arxiv code
  • Symbolic Relational Deep Reinforcement Learning based on Graph Neural Networks
  • Efficient and Interpretable Robot Manipulation with Graph Neural Networks
  • Towards practical multi-object manipulation using relational reinforcement learning
  • Neural task graphs: Generalizing to unseen tasks from a single video demonstration (CVPR2019)

MARL

  • Multi-Agent Inverse Reinforcement Learning: Suboptimal Demonstrations and Alternative Solution arxiv
  • Multi-Agent Generative Adversarial Imitation Learning (NIPS2018) arxiv
  • Social Neuro AI: Social Interaction as the "dark matter" of AI arxiv
  • Emergent Social Learning via Multi-agent Reinforcement Learning (ICML2021) arxiv [no code]
  • Meta-brain Models: biologically-inspired cognitive agents arxiv
  • Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning arxiv [no code]
  • An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning (NIPS2021) arxiv code
  • Option-Critic in Cooperative Multi-agent Systems arxiv code
  • Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning (ICML2019) arxiv code
  • Joint Policy Search for Collaborative Multi-agent Imperfect Information Games (NIPS2020) arxiv code
  • Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction arxiv [no code]
  • Cooperative Exploration for Multi-Agent Deep Reinforcement Learning (ICML2021) arxiv code
  • A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning (ICML2021) arxiv code
  • Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition (ICML2021)
  • Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning (ICML2021)
  • Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory (ICLR2022)
  • Tensor Decomposition for Multi-agent Predictive State Representation
  • Learning with Opponent-Learning Awareness (AAMAS2018) arxiv code
  • Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts (IJCAI2021) arxiv code
  • Communication in multi-agent reinforcement learning: Intention sharing (ICLR2021)
  • Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation arxiv [no code]
  • Agent Modelling under Partial Observability for Deep Reinforcement Learning (NIPS2021) code

Auxiliary task, Representation learning

  • state-representaton-learning-rl blog
  • Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning (ICLR2021 oral) arxiv code
  • Learning Invariant Representations for Reinforcement Learning without Reconstruction arxiv code
  • Decoupling Representation Learning from Reinforcement Learning (ICML2021) arxiv code
  • Dealing with Non-Stationarity in MARL via Trust-Region Decomposition (ICLR2022) arxiv [no code]

Exploration

  • A Tutorial on Thompson Sampling pdf
  • When should agents explore? (ICLR2021 spotlight) arxiv
  • Principled Exploration via Optimistic Bootstrapping and Backward Induction (ICML2021)

Offline

  • NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning arxiv code

Low Rank MDP

  • Representation Learning for Online and Offline RL in Low-rank MDPs (ICLR2022 spotlight) arxiv
  • A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning (ICLR2022 rejected) openreview
  • Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations (NIPS2021) arxiv

Sample Efficiency

  • Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation (ICLR2022 spotlight) arxiv

Interpretability

  • Programmatic Reinforcement Learning without Oracles (ICLR2022 spotlight) openreview [no code]

Sim2Real

  • Understanding Domain Randomization for Sim-to-real Transfer (ICLR2022 spotlight) arxiv

Hierarchical

  • Possibility Before Utility: Learning And Using Hierarchical Affordances (ICLR2022 spotlight) arxiv code
  • Hierarchical Reinforcement Learning: A Comprehensive Survey pdf
  • Hierarchical Multi-Agent Reinforcement Learning pdf
  • Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery (AAMAS2020) arxiv code
  • Graph-Based Skill Acquisition For Reinforcement Learning pdf
  • Compositional Reinforcement Learning from Logical Specifications (NIPS2021) code (Dijkstra)

POMDP

  • Deep Variational Reinforcement Learning for POMDPs (ICLR2018) arxiv code
  • Structured World Belief for Reinforcement Learning in POMDP (ICML2021) arxiv [no code]
  • An Efficient, Expressive and Local Minima-free Method for Learning Controlled Dynamical Systems (AAAI2018) arxiv code
  • Learning Latent Dynamics for Planning from Pixels (ICML2019) arxiv code
  • Recurrent Model-Free RL is a Strong Baseline for Many POMDPs
  • Reinforcement Learning in Rich-Observation MDPs using Spectral Methods
  • On Improving Deep Reinforcement Learning for POMDPs code
  • Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model (NIPS2020)
  • Planning from Pixels using Inverse Dynamics Models (ICLR2021) [no code]

Constrained RL

  • Density Constrained Reinforcement Learning (ICML2021) arxiv

Evolution

  • Trust Region Evolution Strategies (AAAI2019) pdf

Model based

  • Model-Based Reinforcement Learning via Latent-Space Collocation (ICML2021)

Binding

  • Towards Nonlinear Disentanglement in Natural Data with Temporal Sparse Coding (ICLR2021 oral) code

Review

  • Deep Reinforcement Learning: Opportunities and Challenges pdf
  • Reinforcement Learning in Robotics: A Survey pdf
  • Model-based Multi-agent Reinforcement Learning: Recent Progress and Prospects arxiv
  • A Survey of Generalisation in Deep Reinforcement Learning arxiv
  • Approximation Methods for Partially Observed Markov Decision Processes (POMDPs)

Tutorials

Misc

  • Contextual Decision Processes with Low Bellman Rank are PAC-Learnable (ICML2017) arxiv
  • Is the Policy Gradient a Gradient? pdf
  • Bayesian Reinforcement Learning: A Survey arxiv
  • Learning Good State and Action Representations via Tensor Decomposition arxiv
  • On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning (ICLR2022 spotlight) arxiv
  • Constrained Policy Optimization via Bayesian World Models openreview
  • Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution (ICML2017) pdf
  • A Bayesian approach to problems in stochastic estimation and control pdf
  • On Proximal Policy Optimization's Heavy-tailed Gradients (ICML2021) arxiv
  • Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning (ICML2021) arxiv code
  • Muesli: Combining Improvements in Policy Optimization (ICML2021) arxiv code
  • Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision (ICML2021) arxiv [no code]
  • Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective (ICML2021)
  • Temporal Predictive Coding For Model-Based Planning In Latent Space (ICML2021)
  • Neural codes: Firing rates and beyond. (beta distribution and spike coding)
  • Analysis and Improvement of Policy Gradient Estimation (NIPS2011) (variance of policy gradient estimator is inversely proportional to $\sigma^2$)
  • Recurrent predictive state policy networks
  • A recurrent latent variable model for sequential data

Resources