SJHNJU/Grid-World

Grid-World

Let the Robot find a way in the Grid World!

Avoid (1, 3) and head to (0, 3). Also, (1, 1) is a wall!
Initially P={0.1, 0.1, 0.8}: when the robot gets the order to go straight, it turns left with probability 0.1, turns right with probability 0.1, and goes forward with probability 0.8.
The fun thing is: the recommended route changes when P becomes {0.1, 0.05, 0.85} and {0.1, 0.01, 0.89}.
check!
  • when P={0.1, 0.1, 0.8}, the recommended route is:
    2

  • but when P={0.1, 0.05, 0.85}, the route changes to:
    1

  • and if P={0.1, 0.01, 0.89}, the route is:
    3

Check it out! It is fun to see how a small change in P challenges the way we look at things.

method:

  • Value iteration in reinforcement learning
Repeat until convergence:
expectation(a) = sum(P(s_ | s, a) * V(s_) for s_ in S_) -- S_ is the set of possible next states when taking action a in state s
V(s) := R(s) + gamma * max(expectation(a) for a in A) -- A is the set of actions available in state s
and:
V(s) converges to the optimal value V*(s) -- V(s) -> V*(s)

find the best policy:
pi*(s) = argmax(expectation(a) for a in A)
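The update above can be sketched in Python as follows. This is only a minimal sketch, not the code from this repo. The README does not state the grid size, the rewards, or gamma, so the 3x4 grid, the +1 / -1 / -0.02 rewards and gamma=0.99 are all assumptions, as are the names `move`, `outcomes`, `expectation` and `value_iteration`. Cells are read as (row, column) with row 0 at the top.

```python
ROWS, COLS = 3, 4                            # assumed grid size
WALL, GOAL, PIT = (1, 1), (0, 3), (1, 3)     # cells named in the README
MOVES = {"up": (-1, 0), "right": (0, 1), "down": (1, 0), "left": (0, -1)}
ORDER = ["up", "right", "down", "left"]      # clockwise, so index - 1 is a left turn


def move(state, action):
    """Deterministic move; hitting the wall or the border leaves the robot in place."""
    r, c = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    if (r, c) == WALL or not (0 <= r < ROWS and 0 <= c < COLS):
        return state
    return (r, c)


def outcomes(state, action, p):
    """(probability, next state) pairs for p = (P(left), P(right), P(forward))."""
    i = ORDER.index(action)
    p_left, p_right, p_forward = p
    return [(p_left, move(state, ORDER[(i - 1) % 4])),
            (p_right, move(state, ORDER[(i + 1) % 4])),
            (p_forward, move(state, action))]


def expectation(V, state, action, p):
    """Expected value of the next state when taking `action` in `state`."""
    return sum(prob * V[s_] for prob, s_ in outcomes(state, action, p))


def value_iteration(p=(0.1, 0.1, 0.8), gamma=0.99, step_reward=-0.02, tol=1e-9):
    states = [(r, c) for r in range(ROWS) for c in range(COLS) if (r, c) != WALL]
    R = {s: step_reward for s in states}     # assumed reward for ordinary cells
    R[GOAL], R[PIT] = 1.0, -1.0              # assumed terminal rewards
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            if s in (GOAL, PIT):             # terminal cells: no further moves
                new_v = R[s]
            else:
                new_v = R[s] + gamma * max(expectation(V, s, a, p) for a in ORDER)
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:                      # stop once a full sweep barely changes V
            break
    policy = {s: max(ORDER, key=lambda a: expectation(V, s, a, p))
              for s in states if s not in (GOAL, PIT)}
    return V, policy
```

To compare routes, call it with another P, for example `V, policy = value_iteration(p=(0.1, 0.05, 0.85))`, and read the action for each cell from `policy`. Whether the routes match the ones shown above depends on the assumed rewards and gamma.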
  • Q-learning
target: generate a state-action value table
Q(s,a) := (1 - alpha) * Q(s,a) + alpha * [R(s,a) + gamma * max(Q(s_,a_) for a_ in A)] -- s_ is the next state, a_ ranges over the actions available in s_
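The Q-learning update can be sketched in the same way. Again this is only a sketch under assumptions, not the repo's code: the 3x4 grid, the rewards (+1 for reaching the goal, -1 for the pit, -0.02 per other step), the start cell (2, 0), the epsilon-greedy exploration and all hyperparameters are assumptions, and `q_update`, `env_step` and `train` are hypothetical names.

```python
import random

ROWS, COLS = 3, 4                             # assumed grid size
WALL, GOAL, PIT = (1, 1), (0, 3), (1, 3)      # cells named in the README
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]    # up, right, down, left (clockwise)


def q_update(Q, s, a, r, s_next, alpha, gamma):
    """Q(s,a) := (1 - alpha) * Q(s,a) + alpha * [r + gamma * max over a_ of Q(s_next, a_)]."""
    best_next = max(Q[s_next])
    Q[s][a] = (1 - alpha) * Q[s][a] + alpha * (r + gamma * best_next)


def env_step(s, a, rng, p=(0.1, 0.1, 0.8)):
    """Sample a noisy move; p = (P(left), P(right), P(forward))."""
    turn = rng.choices([-1, 1, 0], weights=p)[0]
    dr, dc = MOVES[(a + turn) % 4]
    nxt = (s[0] + dr, s[1] + dc)
    if nxt == WALL or not (0 <= nxt[0] < ROWS and 0 <= nxt[1] < COLS):
        nxt = s                               # bumped into the wall or the border
    reward = 1.0 if nxt == GOAL else -1.0 if nxt == PIT else -0.02
    return nxt, reward


def train(episodes=2000, alpha=0.1, gamma=0.99, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    # One row of four action values per cell; terminal rows are never updated and stay 0.
    Q = {(r, c): [0.0] * 4 for r in range(ROWS) for c in range(COLS) if (r, c) != WALL}
    for _ in range(episodes):
        s = (2, 0)                            # assumed start cell
        while s not in (GOAL, PIT):
            if rng.random() < epsilon:        # explore
                a = rng.randrange(4)
            else:                             # exploit the current table
                a = max(range(4), key=lambda i: Q[s][i])
            s_next, r = env_step(s, a, rng)
            q_update(Q, s, a, r, s_next, alpha, gamma)
            s = s_next
    return Q
```

After `Q = train()`, the greedy action for a cell is the index of the largest entry in `Q[cell]` (0 = up, 1 = right, 2 = down, 3 = left).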
