Epsilon Greedy: graph for Accuracy, Performance...sensitive to the declaration order of the arm ? #21

phsimon · 2022-02-28T14:27:06Z

While writing code to reproduce the performance curves shown in "Analyzing Results from a Monte Carlo Study" (chapter 4), Approach 1 (probability of selecting the best arm) and Approach 2 (average reward), I noticed that changing the order of the arms can change the curves dramatically.
For instance, if I declare the means of my five arms as follows:
    means = [0.8, 0.9, 0.1, 0.5, 0.5]
    n_arms = len(means)
    random.shuffle(means)
    arms = [BernoulliArm(mu) for mu in means]

The shuffle changes the order of the means list, and therefore the order of the arms, and the resulting performance curves may be very different.

For instance, with [0.9, 0.5, 0.1, 0.5, 0.8] as the order after shuffling, I get this curve (average reward per time step):

whereas with [0.1, 0.5, 0.9, 0.5, 0.8], I get:

Do you have an explanation? The same phenomenon shows up for the probability of selecting the best arm (but not as strongly).

Parameters:
num_sims=1000
horizon=250
epsilon=0.1
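
For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the experiment with the parameters above. It is not the repository's code: `run_epsilon_greedy` is a hypothetical helper, the Bernoulli draw is inlined, and the tie-breaking rule (the greedy step picks the first arm with the highest estimate) is an assumption. If the real implementation breaks ties the same way, then while all estimates are still zero the first declared arm is pulled on almost every greedy step, which could make the early part of both curves depend on the declaration order.

```python
import random


def run_epsilon_greedy(means, epsilon=0.1, num_sims=1000, horizon=250, seed=0):
    """Return (average reward, probability of best arm) per time step.

    Hypothetical helper for illustration; not the repository's API.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    best_arm = max(range(n_arms), key=lambda i: means[i])
    avg_reward = [0.0] * horizon
    prob_best = [0.0] * horizon

    for _ in range(num_sims):
        counts = [0] * n_arms
        values = [0.0] * n_arms  # all estimates start at zero
        for t in range(horizon):
            if rng.random() > epsilon:
                # Assumption: ties go to the lowest index (first declared arm).
                arm = values.index(max(values))
            else:
                arm = rng.randrange(n_arms)
            # Bernoulli reward with success probability means[arm].
            reward = 1.0 if rng.random() < means[arm] else 0.0
            counts[arm] += 1
            # Incremental update of the running mean for the chosen arm.
            values[arm] += (reward - values[arm]) / counts[arm]
            avg_reward[t] += reward / num_sims
            prob_best[t] += (arm == best_arm) / num_sims
    return avg_reward, prob_best


if __name__ == "__main__":
    for order in ([0.9, 0.5, 0.1, 0.5, 0.8], [0.1, 0.5, 0.9, 0.5, 0.8]):
        avg, best = run_epsilon_greedy(order)
        print(order, "avg reward at t=0:", round(avg[0], 3),
              "P(best arm) at t=0:", round(best[0], 3))
```

Comparing the first few time steps of the two orderings should show whether the gap comes from the initial tie-breaking or from something else in the simulation.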
