This package aims to provide a Rummikub implementation with a machine learning flavor.
Example of a search function for board games: expectiminimax.
cf. https://github.com/deeplearning4j/rl4j
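As a rough illustration, here is a minimal expectiminimax sketch in Python. The `GameState` interface (`is_terminal`, `evaluate`, `node_type`, `successors`, `outcomes`) is hypothetical, not something this package provides yet; the chance nodes stand for the random tile draws from the pool.

```python
from typing import Iterable, Protocol, Tuple

class GameState(Protocol):
    # Hypothetical interface: a real Rummikub state would wrap the
    # racks, the melds on the table, and the pool.
    def is_terminal(self) -> bool: ...
    def evaluate(self) -> float: ...           # heuristic value of the position
    def node_type(self) -> str: ...            # "max", "min", or "chance"
    def successors(self) -> Iterable["GameState"]: ...
    def outcomes(self) -> Iterable[Tuple[float, "GameState"]]: ...  # (probability, state)

def expectiminimax(state: GameState, depth: int) -> float:
    if depth == 0 or state.is_terminal():
        return state.evaluate()
    if state.node_type() == "chance":
        # Chance node, e.g. drawing a random tile from the pool:
        # the value is the probability-weighted average over outcomes.
        return sum(p * expectiminimax(child, depth - 1)
                   for p, child in state.outcomes())
    values = [expectiminimax(child, depth - 1) for child in state.successors()]
    # Our turn maximizes the evaluation; the opponents' turns minimize it.
    return max(values) if state.node_type() == "max" else min(values)
```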
MDP 5-tuple (S, A, T, γ, R):
- S = {s1, s2, ...} is the possibly infinite set of states the environment can be in. Here: all the tiles on the table plus the tiles in the rack?
- A = {a1, a2, ...} is the possibly infinite set of actions the agent can take. Here: the moves allowed at any given point in the game, basically the tiles that can be played on the table?
- T(s'|s, a) defines the probability of ending up in state s' after taking action a in state s. Here it depends on the other players' decisions and on random picks from the pool?
- γ ∈ [0, 1] is the discount factor, which defines how important future rewards are. The value of a tile that can be put on the table?
- R(s, a, s') is the possibly stochastic reward given for the state transition from s to s' through taking action a. It defines the goal of an agent interacting with the MDP, as it indicates the immediate quality of what the agent is doing.
See [5], §4.
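To make the tuple concrete, here is a minimal sketch of what S and R might look like for Rummikub. The names `Tile`, `RummikubState`, and `reward` are assumptions for illustration, and the state is simplified (jokers and the duplicate copy of each tile are ignored):

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Tile = Tuple[int, str]  # (face value 1-13, color); jokers and duplicates ignored here

@dataclass(frozen=True)
class RummikubState:
    # One state s in S: everything relevant the agent can observe.
    rack: FrozenSet[Tile]                # our own tiles
    table: Tuple[Tuple[Tile, ...], ...]  # melds currently on the table
    pool_size: int                       # hidden pool, known only by its count

def reward(s: RummikubState, a, s_next: RummikubState) -> float:
    # One possible R(s, a, s'): face value of the tiles shed this turn,
    # plus a bonus for emptying the rack (i.e. winning).
    shed = sum(value for value, _ in s.rack - s_next.rack)
    return shed + (100.0 if not s_next.rack else 0.0)
```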
In our case, Q(s, a) is the expected value of taking action a in state s.
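A tabular Q-learner, for instance, keeps Q(s, a) in a lookup table and updates it from observed transitions. The sketch below is for intuition only: the `QLearner` name and hyperparameter defaults are assumptions, and the raw Rummikub state space is far too large for a plain table in practice.

```python
from collections import defaultdict
import random

class QLearner:
    """Minimal tabular Q-learning; states and actions must be hashable."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)  # Q(s, a), defaults to 0
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def choose(self, state, actions):
        # epsilon-greedy: explore with probability epsilon, else exploit Q.
        if random.random() < self.epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next, next_actions):
        # Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
        best_next = max((self.q[(s_next, a2)] for a2 in next_actions), default=0.0)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best_next - self.q[(s, a)])
```

The γ of the MDP tuple above appears here as `gamma` in the update rule; ε-greedy action selection trades off exploring new moves against exploiting the learned values.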
- [1] Abstracting Reusable Cases from Reinforcement Learning
- [2] A brief tutorial on reinforcement learning: The game of Chung Toi
- [3] Reinforcement Learning for Board Games: The Temporal Difference Algorithm
- [4] Multi-Stage Temporal Difference Learning for 2048-like Games
- [5] A Gentle Introduction to Reinforcement Learning
- [6] AI Mahjong
- [7] DominAI
- [8] Giraffe