Reinforcement-Learning-Reproducible-research-assignment

Information & Authors

Authors:

Jochem Soons - 11327030
Jeroen van Wely - 11289988
Niek IJzerman - 11318740

This repository contains the code for Reproducible Research assignment. The assignment is part of the Reinfocement Learning course of the Master's programme Artificial Intelligence at the University of Amsterdam. The link to the corresponding medium paper: https://jochemsoons.medium.com/a-comparison-between-sarsa-and-expected-sarsa-66b931202c75

Files Included

run_experiments.py
- Contains the code to define and run the experiment and get the plots.
windy_gridworld.py
- Contains the code for the windy_gridworld class.
- The majority of the code is taken from another source. For more details check the comments at the top of the file.
models.py
- Contains the code for the Sarsa algorithm.
- Contains the code for the Expected Sarsa algorithm.
policies.py
- Contains the code for the epsilon greedy policy class.
windy_gridworld_experiment.py
- Contains the code to run the windy grid world experiment, which was defined in run_experiment.py, on either Sarsa or Expected Sarsa.

Requirements

We used a conda environment for running our code, that we exported as yml file: see environment.yml.

To create the environment, run:

conda env create -f environment.yml

To activate the environment, run:

conda activate rlproject

How to run the code

To run the code:

All files will have to be within the same directory.
To run an experiment run: "python3 run_experiments.py" Followed by command line argument specification.
An example would be: python3 run_experiments.py --num_episodes=1000 --discount_factor=1.0 --alpha=0.5 --epsilon=0.1 --epochs=100 --average_over_n=50 --no_extra_actions --no_add_stochasticity
To run the experiment using extra actions one would have to change "--no_extra_actions" to "--extra_actions". Likewise to add stochasticity change "--no_add_stochasticity" to "--add_stochasticity".

Parameters & Command Line Arguments

num_episodes:
- Number of episodes per run.
- Default value: 1000
discount_factor:
- Factor that controls how much we care about future actions.
- Default value: 1.0
alpha:
- Update step size
- Default: 0.5
epsilon:
- Probability of picking random action.
- Default value: 0.1
epochs:
- Number of runs we want to average the plots over.
- Default value: 100
average_over_n:
- Smooth the graph by averaging over every n episodes.
- Default value: 50
extra_actions:
- To set extra_actions to True and thus incorperate 8 actions: --extra_actions
- To set extra_actions to False and thus incorperate 4 actions: --no_extra_actions
- Default value: False
add_stochasticity:
- To set add_stochasticity to True and thus incorperate stochasticity: --add_stochasticity
- To set add_stochasticity to False and thus incorperate stochasticity: --no_add_stochasticity
- Default value: False

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Reinforcement-Learning-Reproducible-research-assignment

Information & Authors

Files Included

Requirements

How to run the code

Parameters & Command Line Arguments

About

Releases

Packages

Contributors 3

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
README.md		README.md
environment.yml		environment.yml
models.py		models.py
policies.py		policies.py
run_experiments.py		run_experiments.py
windy_gridworld.py		windy_gridworld.py
windy_gridworld_experiment.py		windy_gridworld_experiment.py

Jeroenvanwely/Reinforcement-Learning-Reproducible-research-assignment

Folders and files

Latest commit

History

Repository files navigation

Reinforcement-Learning-Reproducible-research-assignment

Information & Authors

Files Included

Requirements

How to run the code

Parameters & Command Line Arguments

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages