The goal of the game is to identify the true diagnosis from a set of diagnoses by sequentially making actions to reveal information about the diagnosis.
Given a set $\mathcal D = { d_i }{i=1}^D$ of $D$ of possible diagnoses, the goal of the game is to identify the true diagnosis, $d*$.
The player starts with no information about the true diagnosis.
In earch turn the player plays an action (performs a test)
After each action feedback is given that reduces the set of possible diagnoses. In other words we start with
A diagnosis can be found by an exhaustive search, however exhaustive searches are often time or cost prohibitive.
In such situations where exhaustive searches are impractical one must resort to following a policy in order to find the diagnosis in some sense of optimality.
Definitions of optimality vary and are contended. Commonly used metrics are:
- smallest mean number of guesses
- fewest cumulative guesses
- least number of failed diagnoses
Noting that these metrics are aggregates and are therefore computed over the set of all possible diagnoses.
- Is there a globally optimal policy?
- Why or why not?
- Can efficient policies be developed for all variants?
- Can we devleop policies for huge problems?
- What approximiations can be deployed and can guarantees be developed for their success?
In most games the feedback given is correct. However in real world situations tests can be innacurate or even incorrect. Therefore we need to consider that the feedback we get is random.
Some research has been undertaken for a probabilistic version of Mastermind: Bayesian networks in Mastermind.
Requirements for repeated testing
- If highly unusual outcome from test is received which would normally be re-tested how do we incorporate this into the maximum information criterion?
In most games all action have equal costs. However in real world situations there may be different costs associated with each action.
Examples:
- how long it takes to perform an action
- how much money it costs to perform an action
- quality of life costs e.g. invasive surgery or procedures
Usually, when a patient is being investigated, only a small subset of all available tests is performed. Most selection methods for making this choice fail to account for the risk and cost of the test. By attempting to approximate a decision-analytic ideal, via the concept of quasi-utility, the authors developed the information-to-cost ratio and related measures, which balance the utility of the information gained against the price paid.
These extensions introduce new optimality criteria, which must be satisfied alongside original criteria. In these situations one might accept any of the Pareto optimal solutions.
- Royal Flying Doctor Service, "Where does it hurt?" radio aid chart