Reinforcement Learning: an implementation of the vanilla REINFORCE algorithm. It solves two environments, one with a discrete action space and one with a continuous action space.

coldhenry/RL-REINFORCE-Pytorch

RL: Vanilla REINFORCE algorithms

REINFORCE algorithms

The REINFORCE algorithm is the most basic policy gradient method. It applies the likelihood-ratio policy gradient to learn a suitable policy.

pseudocode[1]
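
The pseudocode figure is not reproduced in this text. As a reference point, the episodic REINFORCE update described in [1] can be written as below. The symbols follow the book, not necessarily the variable names in this repository: $\theta$ are the policy parameters, $\alpha$ is the step size, $\gamma$ is the discount factor, and $G_t$ is the discounted return from step $t$ of an episode of length $T$.

$$
G_t = \sum_{k=t+1}^{T} \gamma^{\,k-t-1} R_k,
\qquad
\theta \leftarrow \theta + \alpha \, \gamma^{t} \, G_t \, \nabla_\theta \ln \pi(A_t \mid S_t, \theta)
$$

The update is applied for every step $t = 0, \dots, T-1$ after a full episode has been generated with the current policy.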

However, in my implementation, the policy gradient is combined with a baseline to increase stability. It is modified as follows:
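
The modified update is not reproduced in this text. In the standard formulation from [1], a baseline $b(S_t)$ is subtracted from the return, so $G_t$ in the update is replaced by $G_t - b(S_t)$. The sketch below is only an illustration of that idea. It assumes the simplest possible baseline, the mean return of the episode, which may differ from the baseline used in this repository. The function names `discounted_returns` and `advantages_with_mean_baseline` are hypothetical.

```python
from statistics import mean


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * r_{t+1} + ... for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages_with_mean_baseline(rewards, gamma=0.99):
    """Subtract a constant baseline (assumed here: the episode's mean return)."""
    returns = discounted_returns(rewards, gamma)
    baseline = mean(returns)
    return [g - baseline for g in returns]


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]
    print(discounted_returns(episode_rewards, gamma=0.5))
    print(advantages_with_mean_baseline(episode_rewards, gamma=0.5))
```

In a PyTorch training loop, values like these would typically weight the stored log-probabilities of the chosen actions, giving a loss of the form `-(log_prob * advantage)` summed over the steps of the episode.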

Environment and Results

  • Discrete action space: CartPole-v0
  • Continuous action space: 2-link arm
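
The two environments differ mainly in how the log-probability $\ln \pi(A_t \mid S_t, \theta)$ is computed. The sketch below assumes a softmax policy over action logits for the discrete case and a Gaussian policy with a given mean and standard deviation for the continuous case. These are common choices, not a description of the networks in this repository, and both function names are hypothetical.

```python
import math


def categorical_log_prob(logits, action):
    """log pi(a|s) for a softmax policy over a finite set of actions."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return logits[action] - log_z


def gaussian_log_prob(mu, std, action):
    """log pi(a|s) for a one-dimensional Gaussian policy N(mu, std^2)."""
    var = std * std
    return (
        -((action - mu) ** 2) / (2.0 * var)
        - math.log(std)
        - 0.5 * math.log(2.0 * math.pi)
    )


if __name__ == "__main__":
    # Two equally likely discrete actions, as in a two-action task.
    print(categorical_log_prob([0.0, 0.0], action=0))
    # One continuous action dimension, e.g. a single joint torque.
    print(gaussian_log_prob(mu=0.0, std=1.0, action=0.0))
```

For a multi-dimensional continuous action with independent dimensions, the per-dimension log-probabilities would be summed.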

Reference

[1] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction.
