While writing code to reproduce the performance curves from the chapter "Analyzing Results from a Monte Carlo Study" (chapter 4), Approach 1 (probability of selecting the best arm) and Approach 2 (average reward), I noticed that changing the order of the arms can change the curves dramatically.
For instance, if I declare the means of my five arms as follows:
means=[0.8, 0.9, 0.1, 0.5, 0.5]
n_arms=len(means)
random.shuffle(means)
arms=[BernoulliArm(mu) for mu in means]
The shuffle changes the order of the means list, and therefore the order of the arms, and the resulting performance curves can be very different.
For instance, with [0.9, 0.5, 0.1, 0.5, 0.8] as the order after shuffling, I get this curve (average reward per time step):
whereas with [0.1, 0.5, 0.9, 0.5, 0.8], I get:
Do you have an explanation? The same phenomenon shows up for the probability of selecting the best arm, but less strongly.
Parameters:
num_sims=1000
horizon=250
epsilon=0.1
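
A minimal, self-contained sketch for reproducing the effect with the parameters above. It assumes an epsilon-greedy agent whose exploit step picks the first index holding the maximum estimate; the algorithm code is not shown in this issue, so that tie-breaking rule is an assumption, and the names `average_reward_curve` and `seed` are hypothetical. Under that assumption all estimates start at 0.0, so the earliest exploit steps always pull whichever arm is listed first, which may be why the start of the curve depends on arm order.

```python
import random


class BernoulliArm:
    """Arm that pays 1.0 with probability p, else 0.0."""

    def __init__(self, p):
        self.p = p

    def draw(self):
        return 1.0 if random.random() < self.p else 0.0


def average_reward_curve(means, epsilon=0.1, num_sims=1000, horizon=250, seed=0):
    """Average reward at each time step over num_sims independent runs.

    Hypothetical helper: an epsilon-greedy loop written for this sketch,
    not taken from the book's code.
    """
    random.seed(seed)
    arms = [BernoulliArm(mu) for mu in means]
    n_arms = len(arms)
    totals = [0.0] * horizon
    for _ in range(num_sims):
        counts = [0] * n_arms    # pulls per arm
        values = [0.0] * n_arms  # running mean reward per arm
        for t in range(horizon):
            if random.random() > epsilon:
                # Exploit. ASSUMPTION: ties resolve to the lowest index,
                # because list.index returns the first maximal entry.
                chosen = values.index(max(values))
            else:
                # Explore: pick an arm uniformly at random.
                chosen = random.randrange(n_arms)
            reward = arms[chosen].draw()
            counts[chosen] += 1
            # Incremental update of the running mean for the chosen arm.
            values[chosen] += (reward - values[chosen]) / counts[chosen]
            totals[t] += reward
    return [total / num_sims for total in totals]


# The two orderings reported above.
curve_a = average_reward_curve([0.9, 0.5, 0.1, 0.5, 0.8])
curve_b = average_reward_curve([0.1, 0.5, 0.9, 0.5, 0.8])
print("first-step average reward:", curve_a[0], curve_b[0])
```

If first-index tie-breaking is the cause, choosing at random among tied maxima should bring the two curves much closer together at early time steps. This has not been checked against the book's implementation.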