A generic MDP gym environment.

I am building this environment primarily for my Reinforcement Learning research. The purpose of this Python module is to enable the creation and simulation of Markov Decision Processes.

The environment is accessible through the OpenAI gym wrapper. An example of how to use it follows.

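import gym
import mdp_environment

env = gym.make("mdp-v0")
env.reset()
for _ in range(1000):
    # Take a random action; only the `done` flag is used here.
    _, _, done, _ = env.step(env.action_space.sample())
    if done:
        # Start a new episode once the current one terminates.
        env.reset()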
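The loop above discards the rewards. As a minimal sketch, the helper below accumulates the return of a single episode instead. It assumes only the classic gym interface used in the example (`reset()` returns an observation, `step(action)` returns a 4-tuple); the name `run_episode` and its parameters are hypothetical and not part of this module.

```python
def run_episode(env, policy, gamma=1.0, max_steps=1000):
    """Roll out one episode and return (return, number of steps).

    Assumes the classic gym API: reset() -> observation and
    step(action) -> (observation, reward, done, info).
    `policy` is any callable mapping an observation to an action.
    """
    obs = env.reset()
    total, discount, steps = 0.0, 1.0, 0
    for _ in range(max_steps):
        obs, reward, done, _ = env.step(policy(obs))
        total += discount * reward  # discounted sum of rewards
        discount *= gamma
        steps += 1
        if done:
            break
    return total, steps
```

With the environment from the example, a random policy could be passed as `lambda obs: env.action_space.sample()`. The default `gamma=1.0` matches the γ listed for both environments.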

There are two custom MDP environments with the following details.

  • mdp-v0:

    • S:

    • A:

    • T:

    • R:

    • P:

    • γ: 1

  • mdp-v1:

    • S:

    • A:

    • T:

    • R:

    • P:

    • γ: 1
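To make the tuple notation concrete, here is a rough sketch of how a small tabular MDP with these components could be represented and sampled. It is not this module's implementation and does not reproduce the actual values of mdp-v0 or mdp-v1. The class `TabularMDP`, the field `start` (an assumed reading of P as an initial-state distribution), and the `terminal` set are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class TabularMDP:
    n_states: int        # S = {0, ..., n_states - 1}
    n_actions: int       # A = {0, ..., n_actions - 1}
    T: list              # T[s][a] is a list of next-state probabilities
    R: list              # R[s][a] is the reward for taking action a in state s
    start: list          # assumption: P read as initial-state probabilities
    terminal: set        # assumption: states that end an episode
    gamma: float = 1.0   # discount factor

    def sample_start(self, rng):
        """Draw an initial state from the start distribution."""
        return rng.choices(range(self.n_states), weights=self.start)[0]

    def step(self, s, a, rng):
        """Sample a transition; returns (next_state, reward, done)."""
        s_next = rng.choices(range(self.n_states), weights=self.T[s][a])[0]
        return s_next, self.R[s][a], s_next in self.terminal


# Hypothetical two-state example: state 0 always moves to terminal state 1.
toy = TabularMDP(
    n_states=2,
    n_actions=1,
    T=[[[0.0, 1.0]], [[0.0, 1.0]]],
    R=[[1.0], [0.0]],
    start=[1.0, 0.0],
    terminal={1},
)
```

Passing an explicit `random.Random` instance keeps sampling reproducible, which is usually convenient in RL experiments.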

For both MDPs, the parameters N and p are adjustable. The MDP chain looks like this.

[Figure: MDP transition]