Reinforcement Learning Notebooks

Minimal implementations of different RL algorithms.

Advantage Actor Critic (A2C) - a2c_tf_cartpole.ipynb
Vanilla Policy Gradient - pg_tf_cartpole.ipynb
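
As a rough orientation, the two algorithms listed differ mainly in how they weight each action's log-probability gradient: vanilla policy gradient typically uses the discounted return, while A2C subtracts a critic's value estimate to form an advantage. The sketch below illustrates that difference in plain Python. It is not taken from the notebooks; the function names, the `gamma` default, and the `values` list of critic estimates are assumptions made for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode.

    `rewards` is the list of per-step rewards; `gamma` is an assumed
    discount factor, not a value taken from the notebooks.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the next one.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Compute A_t = G_t - V(s_t).

    `values` is a hypothetical list of critic estimates V(s_t),
    one per step, standing in for the output of an A2C value network.
    """
    return [g - v for g, v in zip(returns, values)]


if __name__ == "__main__":
    # CartPole gives a reward of 1 per time step.
    episode_rewards = [1.0, 1.0, 1.0]
    g = discounted_returns(episode_rewards, gamma=0.5)
    print(g)
    print(advantages(g, [1.0, 1.0, 1.0]))
```

In a policy-gradient update, the returns would weight the log-probabilities directly, whereas an actor-critic update would use the advantages instead, which usually reduces the variance of the gradient estimate.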