
Replication of Richard Sutton's 1988 results exploring temporal difference learning and the TD(lambda) algorithm.


vjp23/TD_Lambda


TD(λ) Learning and Random Walks

by Vince Petaccio

Reproducing Sutton's 1988 Results

This work replicates the results of Richard Sutton's 1988 paper, Learning to Predict by the Methods of Temporal Differences:

Sutton's Results

The resulting figures confirm the reproducibility of Sutton's results:

Reproduction of Sutton's figure 3

Reproduction of Sutton's figure 4

Reproduction of Sutton's figure 5

See the analysis for a more in-depth review of the original paper and of these reproduced results.
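As a rough orientation for readers new to the method, here is a minimal sketch of TD(λ) on the bounded random walk used in Sutton's paper. It is not the code in td_learning.py. It assumes five non-terminal states with one-hot features, walks that start in the centre state, outcomes of 0 for the left exit and 1 for the right exit, weights initialised to 0.5, and weight updates applied after each sequence. All function names and the default `lam` and `alpha` values are illustrative choices, and only the standard library is used so the sketch runs without extra dependencies.

```python
import random

N_STATES = 5  # non-terminal states (B..F in Sutton's notation)
TRUE_VALUES = [i / 6 for i in range(1, N_STATES + 1)]  # ideal predictions


def generate_walk(rng):
    """Return (visited non-terminal state indices, outcome) for one walk."""
    state = 2  # assumed start: the centre state
    visited = []
    while 0 <= state < N_STATES:
        visited.append(state)
        state += rng.choice((-1, 1))  # step left or right with equal probability
    outcome = 1.0 if state == N_STATES else 0.0
    return visited, outcome


def td_lambda_update(weights, visited, outcome, lam, alpha):
    """Accumulate the TD(lambda) weight increments for a single sequence."""
    delta_w = [0.0] * N_STATES
    trace = [0.0] * N_STATES  # eligibility trace over the one-hot features
    for t, s in enumerate(visited):
        trace = [lam * e for e in trace]  # decay older gradients by lambda
        trace[s] += 1.0                   # gradient of a one-hot prediction
        # The next prediction is the outcome itself at the final step.
        next_pred = weights[visited[t + 1]] if t + 1 < len(visited) else outcome
        td_error = next_pred - weights[s]
        for i in range(N_STATES):
            delta_w[i] += alpha * td_error * trace[i]
    return delta_w


def rms_error(weights):
    """Root-mean-squared error against the ideal predictions."""
    squared = [(w - v) ** 2 for w, v in zip(weights, TRUE_VALUES)]
    return (sum(squared) / N_STATES) ** 0.5


def train(n_sequences=10, lam=0.3, alpha=0.1, seed=0):
    """Run TD(lambda) over freshly generated walks, updating after each one."""
    rng = random.Random(seed)
    weights = [0.5] * N_STATES
    for _ in range(n_sequences):
        visited, outcome = generate_walk(rng)
        increments = td_lambda_update(weights, visited, outcome, lam, alpha)
        weights = [w + d for w, d in zip(weights, increments)]
    return weights
```

Calling `rms_error(train(lam=0.3, alpha=0.1))` for a range of `lam` and `alpha` values, averaged over many seeds, gives the kind of error curves that the figures above summarise. With `lam=1.0` the per-sequence update reduces to the supervised (Widrow-Hoff) update that the paper uses as its baseline.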

Running the Code

Ensure that the NumPy and Matplotlib libraries are installed in the active Python environment.
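One way to confirm this before starting a long experiment is a quick check like the sketch below. It is not part of the repository; the helper name `missing_packages` is hypothetical, and the check only looks up whether each package can be found, without importing it.

```python
from importlib.util import find_spec

REQUIRED = ("numpy", "matplotlib")  # the two libraries named above


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If anything is reported missing, install it into the same environment that will run td_learning.py.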

Open td_learning.py, set the switches for the experiments you want to run to True at the top of the file, and then run the script. Note the runtimes listed next to each switch: many values are calculated to produce smooth plots, so some of the experiments take a while.
