Human-expert normalized scores #52

ThisIsIsaac · 2019-08-11T03:20:09Z

The Rainbow DQN paper uses human-expert normalized scores, so I am not sure how to evaluate the training results against the original paper. Do you know what values were used for human expert scores?

I found some of the values scattered across various papers, but I am not sure whether we can use the same numbers, or how to compute a single normalized value across all Atari games.

Kaixhin · 2019-08-11T16:36:49Z

Looks like I came up with a script in my Atari repo that can do this, but I can't remember where I got the details (must have scoured through lots of DQN papers). I'm not going to do it myself, but if you want to submit a PR that adds the computation and plotting of this score to test.py then I'd be happy to accept it.
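For reference, the normalization commonly used in the DQN line of papers is 100 × (agent − random) / (human − random) per game, and Rainbow reports the median of that value across games. Below is a minimal sketch of that computation. It is not taken from the script mentioned above: the function names and the `baselines` layout are hypothetical, and the numbers are placeholders rather than real Atari baselines.

```python
from statistics import median


def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Return the agent's score as a percentage of the random-to-human gap."""
    if human == random:
        raise ValueError("human and random baselines must differ")
    return 100.0 * (agent - random) / (human - random)


def median_normalized_score(agent_scores, baselines):
    """Aggregate per-game normalized scores into a single median value.

    `baselines` maps game name -> (random_score, human_score).
    """
    per_game = [
        human_normalized_score(score, *baselines[game])
        for game, score in agent_scores.items()
    ]
    return median(per_game)


# Placeholder numbers for illustration only; not real Atari baselines.
baselines = {"game_a": (0.0, 100.0), "game_b": (-20.0, 20.0), "game_c": (10.0, 60.0)}
agent_scores = {"game_a": 50.0, "game_b": 20.0, "game_c": 85.0}
print(median_normalized_score(agent_scores, baselines))
```

The median is used here rather than the mean because a few games with very large normalized scores would otherwise dominate the aggregate.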

ThisIsIsaac · 2019-08-11T23:29:29Z

The human scores in your code differ from the DQN paper's for some games:

Beam Rider:

DQN paper's human score: 7456
your code's human score: 5774.70

Enduro:

DQN paper's human score: 368
your code's human score: 309.60

Qbert:

DQN paper's human score: 18900
your code's human score: 13455.00

Pong:

DQN paper's human score: -3
your code's human score: 9.30

Space Invaders:

DQN paper's human score: 3690
your code's human score: 1652.30

Do you remember which papers you got these numbers from?
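To show why the choice of baseline matters, here is a small sketch that normalizes the same agent score against both Space Invaders human scores quoted above. The agent score and the random score are hypothetical values chosen only for illustration, and the formula assumed is the usual 100 × (agent − random) / (human − random).

```python
def human_normalized_score(agent, random, human):
    """Agent score as a percentage of the random-to-human gap."""
    return 100.0 * (agent - random) / (human - random)


# Human scores for Space Invaders as quoted in this thread.
HUMAN_DQN_PAPER = 3690.0
HUMAN_SCRIPT = 1652.30

# Hypothetical values, not measured results.
RANDOM_SCORE = 0.0
AGENT_SCORE = 1000.0

for label, human in (("DQN paper", HUMAN_DQN_PAPER), ("script", HUMAN_SCRIPT)):
    score = human_normalized_score(AGENT_SCORE, RANDOM_SCORE, human)
    print(f"{label}: {score:.1f}%")
```

Under these assumptions the same agent lands at roughly 27% of human level with one baseline and roughly 61% with the other, so results are only comparable to the paper if the same human scores are used.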

Kaixhin · 2019-08-12T09:28:54Z

Unfortunately not. Maybe you can email one of the authors of Rainbow to see if they can give you a list of the human rewards and also confirm the score calculation?

Kaixhin · 2019-08-17T21:25:43Z

Although they apparently got the human rewards from the original paper, you can check this paper for human rewards and evaluation.
