Maze solving with different algorithms and their comparisons.

With Value Iteration and Policy Iteration
Q-learning
Double-Q-learning
SARSA (State-Action-Reward-State-Action)

1) VI and PI
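
As a rough illustration of the value iteration half of this section, here is a minimal sketch on a tiny maze. The maze layout, the reward values, the discount factor, and the names `MAZE`, `step`, and `value_iteration` are all assumptions made for the example and are not taken from the repository's notebooks. Policy iteration would instead alternate full policy evaluation with greedy policy improvement; only value iteration is shown here.

```python
# Hypothetical 3x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = ["S..#",
        ".#..",
        "...G"]
ACTIONS = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}
GAMMA = 0.9          # assumed discount factor
STEP_REWARD = -1.0   # assumed cost of every ordinary move
GOAL_REWARD = 10.0   # assumed reward for entering the goal


def step(state, action):
    """Deterministic move; hitting a wall or the border leaves the agent in place."""
    r, c = state
    dr, dc = ACTIONS[action]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0])) or MAZE[nr][nc] == "#":
        nr, nc = r, c
    reward = GOAL_REWARD if MAZE[nr][nc] == "G" else STEP_REWARD
    return (nr, nc), reward


def value_iteration(tol=1e-6):
    """Sweep the Bellman optimality backup until the largest change is below tol."""
    states = [(r, c) for r, row in enumerate(MAZE)
              for c, ch in enumerate(row) if ch != "#"]
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if MAZE[s[0]][s[1]] == "G":
                continue  # the goal is terminal, so its value stays 0
            best = max(rew + GAMMA * V[ns]
                       for ns, rew in (step(s, a) for a in ACTIONS))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {}
    for s in states:
        if MAZE[s[0]][s[1]] != "G":
            policy[s] = max(ACTIONS,
                            key=lambda a: step(s, a)[1] + GAMMA * V[step(s, a)[0]])
    return V, policy


V, policy = value_iteration()
```

Following `policy` from the start cell then traces a shortest path to the goal in this toy maze.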

2) Q-learning
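
The core of tabular Q-learning can be sketched as an epsilon-greedy action choice plus a one-step update. The function names, the list-of-lists Q-table, and the default `alpha` and `gamma` values below are assumptions for illustration, not the notebooks' actual code.

```python
import random


def epsilon_greedy(Q, state, n_actions, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[state][a])


def q_learning_update(Q, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action value in the next state."""
    target = r if done else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]
```

Because the target uses `max(Q[s_next])`, the update does not depend on which action the agent actually takes next.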

3) Double-Q-learning
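
Double Q-learning keeps two Q-tables and uses one to choose the next action and the other to evaluate it, which reduces the overestimation caused by the `max` in plain Q-learning. The sketch below is one possible form of the update; the function name, the `update_first` flag, and the default hyperparameters are hypothetical.

```python
import random


def double_q_update(QA, QB, s, a, r, s_next, done,
                    update_first, alpha=0.1, gamma=0.9):
    """Update one table, using the other table to evaluate the chosen next action."""
    upd, other = (QA, QB) if update_first else (QB, QA)
    if done:
        target = r
    else:
        # The table being updated selects the action ...
        best = max(range(len(upd[s_next])), key=lambda i: upd[s_next][i])
        # ... and the other table supplies its value.
        target = r + gamma * other[s_next][best]
    upd[s][a] += alpha * (target - upd[s][a])
    return upd[s][a]


def coin_flip(rng=random):
    """Choose which table to update on this step with equal probability."""
    return rng.random() < 0.5
```

In a training loop one would typically call `double_q_update(..., update_first=coin_flip())` on every step and act greedily with respect to the sum of the two tables.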

4) SARSA
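
SARSA differs from Q-learning only in its target: it bootstraps from the action the agent actually takes next rather than from the greedy one. A minimal sketch of that update follows, with the function name, Q-table layout, and default hyperparameters assumed for illustration.

```python
def sarsa_update(Q, s, a, r, s_next, a_next, done, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the next action actually chosen (a_next)."""
    target = r if done else r + gamma * Q[s_next][a_next]
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]
```

The name comes from the tuple the update consumes: state `s`, action `a`, reward `r`, next state `s_next`, next action `a_next`.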

SARSA using Q-learning

The code for the results is in this repository. We collect the rewards over 5 runs and plot them together to look for patterns.
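
The following is one possible way to collect per-episode rewards over 5 independent runs and plot them together. It uses a hypothetical one-dimensional corridor instead of the repository's maze, and every name and hyperparameter here (`run_q_learning`, `collect_runs`, `plot_runs`, the rewards, `alpha`, `gamma`, `epsilon`) is an assumption for the sketch. Plotting assumes matplotlib is installed; the data collection itself needs only the standard library.

```python
import random

N_STATES = 6        # hypothetical corridor: states 0..5, goal at state 5
ACTIONS = (-1, +1)  # move left or move right


def run_q_learning(episodes, seed, alpha=0.5, gamma=0.9,
                   epsilon=0.1, max_steps=100):
    """Train tabular Q-learning once; return the total reward of each episode."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    rewards = []
    for _ in range(episodes):
        s, total = 0, 0.0
        for _ in range(max_steps):
            # Epsilon-greedy choice, with ties broken at random.
            if rng.random() < epsilon or Q[s][0] == Q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s_next = min(max(s + ACTIONS[a], 0), N_STATES - 1)
            done = s_next == N_STATES - 1
            r = 10.0 if done else -1.0
            target = r if done else r + gamma * max(Q[s_next])
            Q[s][a] += alpha * (target - Q[s][a])
            s, total = s_next, total + r
            if done:
                break
        rewards.append(total)
    return rewards


def collect_runs(n_runs=5, episodes=200):
    """One reward curve per run, each run with its own random seed."""
    return [run_q_learning(episodes, seed=run) for run in range(n_runs)]


def plot_runs(runs):
    """Overlay the reward curves of all runs on one figure."""
    import matplotlib.pyplot as plt  # optional dependency, used only here
    for i, rewards in enumerate(runs):
        plt.plot(rewards, label=f"run {i + 1}")
    plt.xlabel("Episode")
    plt.ylabel("Total reward")
    plt.legend()
    plt.show()


runs = collect_runs()
```

Calling `plot_runs(runs)` then draws the five curves on shared axes, which makes it easy to compare how quickly each run settles.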

We see a common pattern: the rewards are poor at first, but as the number of episodes increases the agent improves and the reward approaches an asymptote.

Folders and files:
Output Must Watch
.gitattributes
Final_VI_PI.ipynb
Final_VI_PI.py
Q-learning_and_SARSA_on_maze.ipynb
Q_learning_(Taxi_v3).ipynb
README.md
double_q_learning_exponential.ipynb
double_q_learning_linear.ipynb
q_learning_exponential.ipynb
q_learning_linear.ipynb
sarsa_exponential.ipynb
sarsa_linear.ipynb

PLEX-GR00T/Maze_solving_MDP

Folders and files

Latest commit

History

Repository files navigation

Maze solving with different algorithms and their comparisons.

1) VI and PI

2) Q-learning

3) Double-Q-learning

4) SARSA

SARSA using Q-learning

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages