Maze solving with several reinforcement learning algorithms and a comparison of their results.

  1. Value Iteration and Policy Iteration
  2. Q-learning
  3. Double-Q-learning
  4. SARSA (State-Action-Reward-State-Action)

1) Value Iteration (VI) and Policy Iteration (PI)

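Below is a minimal sketch of value iteration and policy iteration on a small grid maze. The 4x4 layout, wall positions, reward scheme (-1 per step, +10 at the goal), and discount factor are hypothetical stand-ins, not the exact setup used in this repo.

```python
import numpy as np

# Hypothetical 4x4 maze: start top-left, goal bottom-right, a few wall cells.
N, GOAL = 4, (3, 3)
WALLS = {(1, 1), (2, 1), (1, 3)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
STATES = [(r, c) for r in range(N) for c in range(N) if (r, c) not in WALLS]
GAMMA = 0.99

def model(s, a):
    """Deterministic model: next state and reward for taking action a in state s."""
    nr, nc = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    if not (0 <= nr < N and 0 <= nc < N) or (nr, nc) in WALLS:
        nr, nc = s  # bumping into a wall or the grid edge leaves the agent in place
    return (nr, nc), (10.0 if (nr, nc) == GOAL else -1.0)

def value_iteration(tol=1e-6):
    """Sweep Bellman optimality backups until the value function stops changing."""
    V = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s == GOAL:  # terminal state keeps value 0
                continue
            best = max(r + GAMMA * V[s2] for s2, r in (model(s, a) for a in range(4)))
            delta, V[s] = max(delta, abs(best - V[s])), best
        if delta < tol:
            return V

def policy_iteration(tol=1e-6):
    """Alternate iterative policy evaluation and greedy improvement until stable."""
    pi = {s: 0 for s in STATES}
    V = {s: 0.0 for s in STATES}
    while True:
        # Policy evaluation for the current policy pi.
        while True:
            delta = 0.0
            for s in STATES:
                if s == GOAL:
                    continue
                s2, r = model(s, pi[s])
                v = r + GAMMA * V[s2]
                delta, V[s] = max(delta, abs(v - V[s])), v
            if delta < tol:
                break
        # Policy improvement: act greedily with respect to the evaluated V.
        stable = True
        for s in STATES:
            returns = [r + GAMMA * V[s2] for s2, r in (model(s, a) for a in range(4))]
            best_a = int(np.argmax(returns))
            if best_a != pi[s]:
                pi[s], stable = best_a, False
        if stable:
            return pi, V

V_star = value_iteration()
pi_star, _ = policy_iteration()
print("Optimal value of the start state:", round(V_star[(0, 0)], 2))
print("Greedy first move from the start:", ACTIONS[pi_star[(0, 0)]])
```

Both methods are model-based: they assume access to the transition model above, unlike the sample-based methods in the next sections.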

2) Q-learning
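
A minimal sketch of tabular Q-learning with an ε-greedy behaviour policy. The toy maze, reward scheme, and hyper-parameters (α, γ, ε, episode count) are illustrative assumptions, repeated here so the snippet runs on its own.

```python
import numpy as np

# Same hypothetical 4x4 maze as in the VI/PI sketch above.
N, START, GOAL = 4, (0, 0), (3, 3)
WALLS = {(1, 1), (2, 1), (1, 3)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(s, a):
    """-1 per step, +10 on reaching the goal; walls and edges keep the agent in place."""
    nr, nc = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    if not (0 <= nr < N and 0 <= nc < N) or (nr, nc) in WALLS:
        nr, nc = s
    return (nr, nc), (10.0 if (nr, nc) == GOAL else -1.0), (nr, nc) == GOAL

def q_learning(episodes=500, alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N, N, 4))  # one action value per (row, col, action)
    episode_rewards = []
    for _ in range(episodes):
        s, total, done = START, 0.0, False
        while not done:
            # epsilon-greedy behaviour policy
            a = rng.integers(4) if rng.random() < eps else int(np.argmax(Q[s]))
            s2, r, done = step(s, a)
            # Off-policy target: bootstrap from the best next action, max_a' Q(s', a').
            target = r + (0.0 if done else gamma * np.max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            s, total = s2, total + r
        episode_rewards.append(total)
    return Q, episode_rewards

Q, rewards = q_learning()
print("Mean reward over the last 50 episodes:", round(float(np.mean(rewards[-50:])), 2))
```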

3) Double-Q-learning
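
Double Q-learning keeps two value tables and, at each step, updates one of them chosen at random: the table being updated selects the greedy next action, while the other table evaluates it. Decoupling selection from evaluation reduces the maximisation bias of the single-table Q-learning update. A minimal sketch of just that update rule; the function name, default parameters, and integer state/action indexing are assumptions for illustration.

```python
import numpy as np

def double_q_update(QA, QB, s, a, r, s2, done, alpha=0.1, gamma=0.99,
                    rng=np.random.default_rng(0)):
    """One Double Q-learning step: with probability 1/2 update table A,
    selecting the greedy next action with A but evaluating it with B
    (and vice versa for table B)."""
    if rng.random() < 0.5:
        a_star = int(np.argmax(QA[s2]))
        target = r + (0.0 if done else gamma * QB[s2, a_star])
        QA[s, a] += alpha * (target - QA[s, a])
    else:
        b_star = int(np.argmax(QB[s2]))
        target = r + (0.0 if done else gamma * QA[s2, b_star])
        QB[s, a] += alpha * (target - QB[s, a])

# Tiny demo with dummy tables: e.g. 16 flattened maze cells and 4 actions.
QA, QB = np.zeros((16, 4)), np.zeros((16, 4))
double_q_update(QA, QB, s=0, a=3, r=-1.0, s2=1, done=False)
# Actions themselves are typically chosen epsilon-greedily with respect to QA + QB.
```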

4) SARSA

SARSA compared with Q-learning
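
SARSA is the on-policy counterpart of Q-learning: the bootstrap target uses the next action the ε-greedy policy actually takes, Q(s', a'), rather than max over a' of Q(s', a'). A minimal sketch on the same hypothetical toy maze, repeated so the snippet is self-contained.

```python
import numpy as np

# Same hypothetical 4x4 maze as in the sketches above.
N, START, GOAL = 4, (0, 0), (3, 3)
WALLS = {(1, 1), (2, 1), (1, 3)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def step(s, a):
    nr, nc = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    if not (0 <= nr < N and 0 <= nc < N) or (nr, nc) in WALLS:
        nr, nc = s
    return (nr, nc), (10.0 if (nr, nc) == GOAL else -1.0), (nr, nc) == GOAL

def sarsa(episodes=500, alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N, N, 4))
    policy = lambda s: rng.integers(4) if rng.random() < eps else int(np.argmax(Q[s]))
    episode_rewards = []
    for _ in range(episodes):
        s, total, done = START, 0.0, False
        a = policy(s)                 # choose the first action before the loop
        while not done:
            s2, r, done = step(s, a)
            a2 = policy(s2)           # next action chosen by the same behaviour policy
            # On-policy target: bootstrap from the action that will actually be taken.
            target = r + (0.0 if done else gamma * Q[s2][a2])
            Q[s][a] += alpha * (target - Q[s][a])
            s, a, total = s2, a2, total + r
        episode_rewards.append(total)
    return Q, episode_rewards

Q, rewards = sarsa()
print("Mean reward over the last 50 episodes:", round(float(np.mean(rewards[-50:])), 2))
```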

You can find the code for the results below here. We collect the rewards over 5 runs and plot them together to look for common patterns.
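
A sketch of the kind of collection-and-plotting loop described above: train 5 independent runs, record the total reward of each episode, and plot the individual runs together with their mean. Q-learning is used here as the example agent (any of the agents above could be substituted), and the environment, trainer, and hyper-parameters are the same hypothetical toy setup as in the earlier sketches, not this repo's exact code.

```python
import numpy as np
import matplotlib.pyplot as plt

# Compact copy of the hypothetical maze and Q-learning trainer from the sketches above,
# repeated here so this snippet runs on its own.
N, START, GOAL = 4, (0, 0), (3, 3)
WALLS = {(1, 1), (2, 1), (1, 3)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def step(s, a):
    nr, nc = s[0] + ACTIONS[a][0], s[1] + ACTIONS[a][1]
    if not (0 <= nr < N and 0 <= nc < N) or (nr, nc) in WALLS:
        nr, nc = s
    return (nr, nc), (10.0 if (nr, nc) == GOAL else -1.0), (nr, nc) == GOAL

def train_once(seed, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    """One Q-learning training run; returns the total reward of each episode."""
    rng = np.random.default_rng(seed)
    Q, rewards = np.zeros((N, N, 4)), []
    for _ in range(episodes):
        s, total, done = START, 0.0, False
        while not done:
            a = rng.integers(4) if rng.random() < eps else int(np.argmax(Q[s]))
            s2, r, done = step(s, a)
            Q[s][a] += alpha * (r + gamma * (0.0 if done else np.max(Q[s2])) - Q[s][a])
            s, total = s2, total + r
        rewards.append(total)
    return rewards

# Collect the per-episode rewards of 5 independent runs and plot them together.
runs = np.array([train_once(seed) for seed in range(5)])
for i, run in enumerate(runs):
    plt.plot(run, alpha=0.4, label=f"run {i}")
plt.plot(runs.mean(axis=0), color="black", linewidth=2, label="mean of 5 runs")
plt.xlabel("Episode")
plt.ylabel("Total reward per episode")
plt.legend()
plt.show()
```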

We see a common pattern: the rewards are poor at first, but as the number of episodes increases the agent improves and the rewards approach an asymptote.