Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding the ability to specify classes to filter user-skill calculation #796

Merged
merged 2 commits into from
Nov 11, 2024

Conversation

AgentM-GEG
Copy link
Contributor

Context: The current version of the user-skill calculation is done on ALL detected classes present within a task (either the mean skill or that skill for all classes be above a certain skill_threshold). This creates a situation where, for a task with large number of classes OR imbalanced datasets, the user has to see at least N images per class before they get a chance to even be considered for leveling up.

Motivation: Research teams should be given the opportunity to provide specific classes using which they can judge the leveling up decision.

This PR:

  • The user_skill_reducer function now takes in focus_classes argument (default: None).
  • The solution involved a conditional statement saying if focus_classes are provided (e.g., ['square', 'triangle']), then compute the mean_skill, null_removed_classes, and null_removed_class_counts on these subset classes (instead of everything).
  • As such, the output still contains the entire confusion matrix (for all classes), but, the mean skill is computed on user-specified classes only.
  • A refactoring on lines 87-89 were done as this block of code is just repeated between if binary... else: ... statement, with the only difference being the null_class='False' in the binary case.

An example caeasar config looks as such: .../reducers/user_skill_reducer?mode='one-to-one'&count_threshold=5&focus_classes=['1', '2']&strategy='all'&skill_threshold=0.2

@AgentM-GEG
Copy link
Contributor Author

tagging @ramanakumars as well for visibility and crosschecking.

@lcjohnso
Copy link
Member

Hi @CKrawczyk -- Would you mind reviewing this PR?

Copy link
Collaborator

@CKrawczyk CKrawczyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All changes look reasonable to me. It might be worth adding a new test to https://github.com/zooniverse/aggregation-for-caesar/blob/master/panoptes_aggregation/tests/reducer_tests/test_user_skill_reducer.py but this is not a blocker.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lines 60-64 reflect a slightly different treatment of the focus_classes keyword argument input so that the tests can ingest this info.

@AgentM-GEG
Copy link
Contributor Author

@CKrawczyk I added a test for the focus_classes behavior and pushed those changes. I also changed a little bit of the reducer_wrapper code where the focus_classes argument is being parsed appropriately. Let me know how these changes look.

@lcjohnso lcjohnso requested a review from CKrawczyk November 7, 2024 22:15
Copy link
Collaborator

@CKrawczyk CKrawczyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Let me know when you are happy to have it merged.

@AgentM-GEG
Copy link
Contributor Author

@CKrawczyk @lcjohnso , thank you! I am happy for it to be merged whenever works for either/both of you.

@CKrawczyk CKrawczyk merged commit c7505ab into zooniverse:master Nov 11, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants