PLEX-GR00T/Maze_solving_MDP

Maze solving with different algorithms, and a comparison of their results.

  1. Value Iteration and Policy Iteration
  2. Q-learning
  3. Double-Q-learning
  4. SARSA (State-Action-Reward-State-Action)

1) VI and PI
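
As a rough illustration of the value iteration half of this section, here is a minimal sketch on a small grid maze. The maze layout, the step cost of -1, the discount factor of 0.9, and all function names are assumptions made for this example; they are not taken from the repository's code. Policy iteration is not shown.

```python
# Hypothetical 4x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 0.9          # discount factor (assumed)
STEP_REWARD = -1.0   # cost of every move (assumed)


def step(state, action):
    """Deterministic move; hitting a wall or the border leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if 0 <= r < len(MAZE) and 0 <= c < len(MAZE[0]) and MAZE[r][c] != "#":
        return (r, c)
    return state


def value_iteration(tol=1e-6):
    """Sweep the Bellman optimality backup until the largest change is below tol."""
    states = [(r, c) for r, row in enumerate(MAZE)
              for c, ch in enumerate(row) if ch != "#"]
    goal = next(s for s in states if MAZE[s[0]][s[1]] == "G")
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s == goal:
                continue  # terminal state keeps value 0
            best = max(STEP_REWARD + GAMMA * V[step(s, a)] for a in ACTIONS)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V, goal


def greedy_path(V, goal, start=(0, 0), max_steps=50):
    """Follow the greedy policy: always move to the neighbour with the highest value."""
    path = [start]
    while path[-1] != goal and len(path) < max_steps:
        s = path[-1]
        path.append(max((step(s, a) for a in ACTIONS), key=lambda n: V[n]))
    return path


V, goal = value_iteration()
path = greedy_path(V, goal)
```

Because every move costs the same here, the greedy path with respect to `V` is a shortest path from the start to the goal.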

2) Q-learning
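
For reference, the core of tabular Q-learning is a single update rule. The sketch below assumes a dictionary-backed Q-table keyed by `(state, action)` and illustrative values for the learning rate and discount factor; the function name and parameters are hypothetical, not the repository's API.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions,
                      alpha=0.1, gamma=0.9, done=False):
    """Off-policy TD update: bootstrap from the best action value in the next state."""
    if done:
        target = r
    else:
        target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]


# Tiny usage example with made-up numbers.
Q = defaultdict(float)                      # unseen entries default to 0.0
actions = ["up", "down", "left", "right"]
Q[((0, 1), "right")] = 2.0                  # pretend this was learned earlier
new_value = q_learning_update(Q, (0, 0), "right", -1.0, (0, 1), actions)
```

The `max` over next-state actions is what makes the method off-policy: the target ignores which action the agent will actually take next.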

3) Double-Q-learning
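
Double Q-learning keeps two Q-tables and, on each step, uses one to pick the best next action and the other to evaluate it, which reduces the overestimation caused by taking a `max` over noisy estimates. The following is a minimal sketch under the same assumptions as a plain tabular setup (dictionary Q-tables, illustrative `alpha` and `gamma`); all names are hypothetical.

```python
import random
from collections import defaultdict


def double_q_update(QA, QB, s, a, r, s_next, actions,
                    alpha=0.1, gamma=0.9, done=False, rng=random):
    """Update one randomly chosen table, evaluating its argmax with the other table."""
    if rng.random() < 0.5:
        update, evaluate = QA, QB
    else:
        update, evaluate = QB, QA
    if done:
        target = r
    else:
        # Select the action with the table being updated ...
        best = max(actions, key=lambda b: update[(s_next, b)])
        # ... but take its value from the other table.
        target = r + gamma * evaluate[(s_next, best)]
    update[(s, a)] += alpha * (target - update[(s, a)])


# Tiny usage example with made-up numbers.
QA, QB = defaultdict(float), defaultdict(float)
actions = ["up", "down", "left", "right"]
QA[((0, 1), "right")] = 5.0
QB[((0, 1), "right")] = 1.0
double_q_update(QA, QB, (0, 0), "right", -1.0, (0, 1), actions)
```

When acting, a common choice is to be epsilon-greedy with respect to the sum (or average) of the two tables.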

4) SARSA
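
SARSA differs from Q-learning only in its target: it bootstraps from the action the agent actually takes next, so exploration affects the learned values. A minimal sketch follows, assuming an epsilon-greedy behaviour policy and a dictionary Q-table; the helper names and parameter values are illustrative assumptions.

```python
import random
from collections import defaultdict


def epsilon_greedy(Q, s, actions, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda b: Q[(s, b)])


def sarsa_update(Q, s, a, r, s_next, a_next,
                 alpha=0.1, gamma=0.9, done=False):
    """On-policy TD update: bootstrap from the action actually chosen in s_next."""
    target = r if done else r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q[(s, a)]


# Tiny usage example with made-up numbers.
Q = defaultdict(float)
actions = ["up", "down", "left", "right"]
Q[((0, 1), "right")] = 2.0
Q[((0, 1), "down")] = -3.0
a_next = "down"  # pretend exploration picked a non-greedy action
sarsa_value = sarsa_update(Q, (0, 0), "right", -1.0, (0, 1), a_next)
greedy_choice = epsilon_greedy(Q, (0, 1), actions, epsilon=0.0)
```

In this example the update uses the value of `"down"` rather than the better `"right"`, which is exactly where SARSA and Q-learning diverge.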

SARSA using Q-learning

You can find the code for the results below here. It collects the rewards from 5 runs and plots them together to reveal any patterns.
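
A minimal sketch of that experiment loop is given here. To stay self-contained it trains a Q-learning agent on a hypothetical 6-cell corridor rather than the repository's maze, and the hyperparameters, seeds, and function names are all assumptions. Only the reward collection is shown.

```python
import random

N_STATES = 6        # corridor cells 0..5, goal at the right end (hypothetical)
ACTIONS = (-1, +1)  # move left / move right


def run_q_learning(seed, n_episodes=200, alpha=0.5, gamma=0.9,
                   epsilon=0.1, max_steps=100):
    """Train one agent from scratch and return its total reward per episode."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    episode_rewards = []
    for _ in range(n_episodes):
        s, total = 0, 0.0
        for _ in range(max_steps):
            if rng.random() < epsilon or Q[s][0] == Q[s][1]:
                a = rng.randrange(2)  # explore, or break ties at random
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            done = s_next == N_STATES - 1
            r = -1.0  # every move costs one unit of reward
            target = r if done else r + gamma * max(Q[s_next])
            Q[s][a] += alpha * (target - Q[s][a])
            s, total = s_next, total + r
            if done:
                break
        episode_rewards.append(total)
    return episode_rewards


# Collect the reward curves of 5 independent runs, one seed per run.
runs = [run_q_learning(seed) for seed in range(5)]
# Per-episode mean across runs, useful as a summary line on the plot.
mean_curve = [sum(ep) / len(runs) for ep in zip(*runs)]
```

Each list in `runs` can then be drawn as one line on a shared reward-versus-episode plot, with `mean_curve` overlaid.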

The common pattern is that rewards are poor at first, but as the number of episodes increases the agent improves and the reward approaches an asymptote.