Tomáš Gavenčiak edited this page Jan 3, 2019 · 3 revisions

Collection of design choices made or to be considered.

Game types and classes

  • Partial information, custom observations: Game
  • Full information (state observation) but simultaneous moves: SimultaneousGame
    • A partial-information simultaneous game is implemented as a plain Game, since a dedicated class would bring no significant advantage.
  • Partial information, perfect recall sequence-based observations: ObservationGame
  • Full information game (state observation): CompleteInformationGame

Note that if update_state in a complete information game depends on history, observing state information is not enough!

Game specification style

The internal game state can either be updated in place, or be immutable and created anew on every update. ActivePlayer can be renamed to StateInfo. StateInfo should also contain observations (either only the new ones or all of them).

Some variants of update_state could accept a Situation instead of a State. While implementations could then use some of the extra information (history, observations, rewards, past StateInfo), this could also make the API more brittle. (But I would prefer Situation anyway - Tomas).

Having a separate Game.state_info() may be slower (it must either recompute or cache information already computed during the update) and may not be any cleaner.

Const state

  • Game.initial_state() -> (state, state_info)
  • Game.update_state(situation, action) -> (state, state_info)

Separate StateInfo

  • Game.initial_state() -> state
  • Game.update_state(situation, action) -> state
  • Game.state_info(state) -> state_info
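
The immutable variant with a separate state_info call can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: CountdownGame, the fields of StateInfo, and the use of -1 for "no active player" are all hypothetical, and update_state takes a bare state rather than a situation to keep the example short.

```python
from typing import NamedTuple, Tuple


class StateInfo(NamedTuple):
    # Hypothetical fields; the real StateInfo may carry more (e.g. observations).
    player: int               # index of the player to move, -1 when the game is over
    actions: Tuple[int, ...]  # actions available in this state


class CountdownGame:
    """Toy game: players alternately subtract 1 or 2 from a counter until it hits 0."""

    def __init__(self, start: int = 3):
        self.start = start

    def initial_state(self) -> Tuple[int, int]:
        # The state is an immutable tuple (counter, player_to_move).
        return (self.start, 0)

    def update_state(self, state: Tuple[int, int], action: int) -> Tuple[int, int]:
        counter, player = state
        # A new state is built; the old one is left untouched.
        return (counter - action, 1 - player)

    def state_info(self, state: Tuple[int, int]) -> StateInfo:
        counter, player = state
        if counter == 0:
            return StateInfo(player=-1, actions=())
        allowed = tuple(a for a in (1, 2) if a <= counter)
        return StateInfo(player=player, actions=allowed)
```

Because states are never mutated, earlier states can be kept in a history or shared between search branches without copying. The cost is that state_info has to derive everything from the state again on each call.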

Updating state in-place

  • Game.initial_state() -> (state, state_info)
  • Game.update_state(state, action) -> state_info

Separate StateInfo

  • Game.initial_state() -> state
  • Game.update_state(state, action)
  • Game.state_info(state) -> state_info
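
For comparison, here is a rough sketch of the in-place variant where update_state mutates the state and returns only the state info. All names (CounterState, InPlaceCountdownGame, fork) are hypothetical, and the state info is simplified to a (player, actions) tuple.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Tuple

Info = Tuple[int, Tuple[int, ...]]  # (player to move or -1, allowed actions)


@dataclass
class CounterState:
    counter: int
    player: int = 0
    moves: List[int] = field(default_factory=list)


class InPlaceCountdownGame:
    """Same toy game as a mutable-state sketch: subtract 1 or 2 until the counter is 0."""

    def initial_state(self) -> Tuple[CounterState, Info]:
        state = CounterState(counter=3)
        return state, self._info(state)

    def update_state(self, state: CounterState, action: int) -> Info:
        # Mutates the given state object and returns only the new info.
        state.counter -= action
        state.player = 1 - state.player
        state.moves.append(action)
        return self._info(state)

    def _info(self, state: CounterState) -> Info:
        if state.counter == 0:
            return (-1, ())
        return (state.player, tuple(a for a in (1, 2) if a <= state.counter))


def fork(state: CounterState) -> CounterState:
    # With in-place updates, the caller must copy before exploring a branch.
    return copy.deepcopy(state)
```

The trade-off shows up in fork: in-place updates avoid allocating a new state per move, but any code that wants to keep or revisit an earlier state (tree search, history recording) has to copy it explicitly first.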

Action space

Fixed action set [proposed]

One fixed action list for a Game instance (e.g. in Game.actions), every action list is a subset, indexing always into this set.

(+): Easy to record and interpret (just indices), easy to encode as NN output.

(-): Some games may be hard to fit (examples?). However, games with huge action sets (e.g. card shuffling) are hard to learn anyway.
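
To illustrate why a fixed action set is easy to encode as NN output, the sketch below masks a fixed-size output vector down to the currently allowed action indices. The ACTIONS tuple and the masked_policy helper are assumptions made for this example only; the logits stand in for the output of some network.

```python
import math
from typing import List, Sequence

# Hypothetical fixed action list, standing in for something like Game.actions.
ACTIONS = ("rock", "paper", "scissors", "pass")


def masked_policy(logits: Sequence[float], allowed: Sequence[int]) -> List[float]:
    """Softmax over the logits, restricted to the allowed action indices.

    The output always has one entry per action in the fixed set;
    disallowed actions get probability 0.
    """
    if not allowed:
        raise ValueError("at least one action must be allowed")
    # Subtract the maximum allowed logit for numerical stability.
    top = max(logits[i] for i in allowed)
    weights = [0.0] * len(logits)
    for i in allowed:
        weights[i] = math.exp(logits[i] - top)
    total = sum(weights)
    return [w / total for w in weights]
```

Since every state's action list is a subset of the same indexed set, the network output size never changes and a recorded play is just a list of integers.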

Arbitrary (open) action set [current]

(+): Supports large sets, seems flexible

(-): Type mishmashes (indices vs. actual numerical actions), reverse indexing, unintended very large action sets (e.g. floats).

Games with unlimited/large action sets

  • Shuffle the whole deck (infeasible anyway, can be replaced by drawing individual cards)
  • Select from arbitrarily many cards (e.g. Port Royal card selection)
    • In Port Royal, you can just offer all card types (would work better anyway)
  • Some other card games?
  • Bidding games have large/unbounded bids (quantize? how does DeepStack do it?)

Solutions

  • Allow one unbounded set of actions (e.g. negative indices or all indices beyond last action)
    • Encoding from NN still has to be customized (e.g. bidding quantization, recurrent NN decoder, ...)
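
One possible reading of the "all indices beyond last action" idea, combined with bid quantization, is sketched below. This is only an assumption about how such a scheme could look: FIXED_ACTIONS, BID_STEP, decode_action and encode_bid are hypothetical names, and a fixed step size is just one of many quantization choices.

```python
from typing import Tuple, Union

FIXED_ACTIONS = ("fold", "call")  # hypothetical fixed part of the action set
BID_STEP = 10                     # hypothetical quantization step for bids

Action = Union[str, Tuple[str, int]]


def decode_action(index: int) -> Action:
    """Map an action index to a fixed action or to a quantized bid.

    Indices 0 .. len(FIXED_ACTIONS) - 1 are the fixed actions; every index
    beyond that is a bid of 1, 2, 3, ... steps of BID_STEP.
    """
    if index < 0:
        raise ValueError("action index must be non-negative")
    if index < len(FIXED_ACTIONS):
        return FIXED_ACTIONS[index]
    steps = index - len(FIXED_ACTIONS) + 1
    return ("bid", steps * BID_STEP)


def encode_bid(amount: int) -> int:
    """Round a bid to the nearest multiple of BID_STEP (half rounds up, minimum
    one step) and return its action index."""
    steps = max(1, (amount + BID_STEP // 2) // BID_STEP)
    return len(FIXED_ACTIONS) + steps - 1
```

Recorded actions stay plain integers, but as noted above, a network still needs a custom way to produce indices in the unbounded range.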