Mean reciprocal rank (MRR) of leaderboard scores #3451
Merged
@ mention of reviewers
@Didayolo
Purpose of the changes contained in this PR.
For a new competition that we will soon launch on CodaLab, we would like to use the mean reciprocal rank (MRR) as a summary metric for the overall evaluation of a group of scores in the leaderboard. However, at the moment, CodaLab only allows the (weighted) average operation over the ranks of the scores.
This is mentioned in some comments on #2736 and #3449.
Issues this PR resolves
This PR adds an operation that computes the MRR over a group of leaderboard scores, alongside the existing (weighted) average.
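For reviewers unfamiliar with the metric, here is a minimal sketch of how MRR over a participant's per-column ranks can be computed. It is not the code in this PR: the function names `rank_scores` and `mean_reciprocal_rank` are hypothetical, and the tie handling (tied scores share the best rank) is an assumption that may differ from how CodaLab ranks scores.

```python
from typing import Dict, List


def rank_scores(scores: Dict[str, float], higher_is_better: bool = True) -> Dict[str, int]:
    """Rank participants on one leaderboard column (1 = best).

    Assumption: tied scores share the best rank (1, 1, 3, ...).
    """
    ordered = sorted(scores.values(), reverse=higher_is_better)
    # list.index returns the first position of a value, so ties get the same rank.
    return {name: ordered.index(value) + 1 for name, value in scores.items()}


def mean_reciprocal_rank(ranks: List[int]) -> float:
    """Mean of 1/rank over the columns in a group; higher is better, max is 1.0."""
    if not ranks:
        raise ValueError("at least one rank is required")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical example: one participant ranked 1st, 2nd and 4th on three columns.
ranks = [1, 2, 4]
mrr = mean_reciprocal_rank(ranks)      # (1 + 1/2 + 1/4) / 3
avg_rank = sum(ranks) / len(ranks)     # plain average rank, for comparison
```

Unlike the average rank, where lower is better, a higher MRR is better, and a first place on one column contributes much more than a mid-table placement.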
A checklist for hand testing
Any relevant files for testing
I have created a modified IRIS competition bundle that has, as leaderboard columns, both the MRR and the Avg of the original metrics (Prediction Score and Duration).
Checklist