Analysis of Metrics for RL Testing

This repository contains code implementing learning-based testing of deep RL agents playing Super Mario Bros., where we track metrics computed from neural networks used by the RL agents. It accompanies the paper "Bridging the Gap Between Models in RL: Test Models vs. Neural Networks", submitted to the AMOST workshop 2024.

The implementation is an adaptation of previous work on Differential Safety Testing of Deep RL Agents. The deep RL code is based on the PyTorch tutorial on training deep RL agents for Super Mario Bros., and we use the gym-super-mario-bros environment.
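To give a rough idea of what a metric computed from a neural network can look like, the sketch below computes neuron coverage: the fraction of neurons whose activation exceeds a threshold for at least one observation. It is a dependency-free illustration and not the code from this repository. The function names, the toy weights, and the threshold are all assumptions made for the example, and the metrics used in the paper may be defined differently.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # one row of weights per neuron


def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


def layer_activations(layers: Sequence[Matrix],
                      biases: Sequence[Sequence[float]],
                      observation: Sequence[float]) -> List[List[float]]:
    """Return the ReLU activations of every layer for one observation."""
    activations = []
    current = list(observation)
    for weights, bias in zip(layers, biases):
        current = [
            relu(sum(w * x for w, x in zip(row, current)) + b)
            for row, b in zip(weights, bias)
        ]
        activations.append(current)
    return activations


def neuron_coverage(layers: Sequence[Matrix],
                    biases: Sequence[Sequence[float]],
                    observations: Sequence[Sequence[float]],
                    threshold: float = 0.0) -> float:
    """Fraction of neurons activated above `threshold` by at least one observation."""
    covered = [[False] * len(bias) for bias in biases]
    for observation in observations:
        per_layer = layer_activations(layers, biases, observation)
        for layer_idx, layer in enumerate(per_layer):
            for neuron_idx, value in enumerate(layer):
                if value > threshold:
                    covered[layer_idx][neuron_idx] = True
    total = sum(len(layer) for layer in covered)
    hit = sum(sum(layer) for layer in covered)
    return hit / total if total else 0.0


# Hypothetical toy network: 2 inputs -> 3 neurons -> 2 neurons.
toy_layers = [
    [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]],
    [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
]
toy_biases = [[0.0, 0.0, 0.0], [0.0, 0.0]]

# Coverage grows as the test inputs become more diverse.
single = neuron_coverage(toy_layers, toy_biases, [[1.0, 0.0]])
diverse = neuron_coverage(
    toy_layers, toy_biases, [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
)
print(single, diverse)
```

In a testing setting, the observations would be the states an agent encounters while executing test cases, so coverage can be compared across test suites.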

Structure

The dependencies required for setting up the experiments are given in environment.yml, which can be used together with Conda. The root directory contains the main implementation files, and the directory stats includes scripts to analyze and plot experimental results.

This is an initial version of our implementation, setup, and experiments.

