-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Intake lists #52
Intake lists #52
Conversation
Codecov Report
@@ Coverage Diff @@
## master #52 +/- ##
==========================================
- Coverage 98.36% 98.26% -0.10%
==========================================
Files 24 24
Lines 855 865 +10
==========================================
+ Hits 841 850 +9
- Misses 14 15 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
@gwaygenomics This last Commit should fix every thing. So this is ready for review :) |
Fixes #51 |
Ah, I still need to run the Demo notebook |
@gwaygenomics I reran the demo, All up to do date now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work. One main discussion point: Do we want to force the user to input a list, or should we allow single elements as well?
A supplementary point is to please add back the Cell Painting demo notebook (I think you accidentally deleted it)
cytominer_eval/evaluate.py
Outdated
grit_control_perts: List[str] = ["None"], | ||
grit_replicate_summary_method: str = "mean", | ||
mp_value_params: dict = {}, | ||
enrichment_percentile: float = 0.5, | ||
enrichment_percentile: List[float] = [0.99, 0.98], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about accepting both single values and lists - do you agree?
Also, nice to see that you're suggesting a more reasonable default after your experimentation!
cytominer_eval/evaluate.py
Outdated
@@ -129,7 +130,7 @@ def evaluate( | |||
metric_result = precision_recall( | |||
similarity_melted_df=similarity_melted_df, | |||
replicate_groups=replicate_groups, | |||
k=precision_recall_k, | |||
k_list=precision_recall_k, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will need to revisit this decision pending discussion on accepting both ints and list of ints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
) -> dict: | ||
similarity_melted_df: pd.DataFrame, | ||
replicate_groups: List[str], | ||
percentile: List[float], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just making sure that you're intentionally removing the default of 0.9.
Also, this will need to change depending on how we decide users should interact (single int or list of ints)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -28,48 +30,54 @@ def enrichment( | |||
replicate_groups : List | |||
a list of metadata column names in the original profile dataframe to use as | |||
replicate columns. | |||
percentile : float | |||
enrichment_percentile : List of floats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment about might need to change.
also, shouldn't this just be percentile
instead of enrichment_percentile
? If not, then u need to change the function argument name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enrichment_percentile is still not the name of the function argument (line 19 above) do you see what I mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thinks it fine how it is now. Do you think I should change the name of the variable in enrichment.py to enrichment_percentile instead of just percentile?
similarity_melted_df: pd.DataFrame, | ||
replicate_groups: List[str], | ||
k: int, | ||
similarity_melted_df: pd.DataFrame, replicate_groups: List[str], k_list: List[int], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment about int vs. list[int] decisions.
Also, I LOVE the fact that you're making this enhancement all in one shot, and not just to the enrichment metric. So useful to have all these changes in one PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
assert result_df.shape == (7, 4) | ||
assert result_df.percentile[0] == 1.0 | ||
assert result.shape == (7, 4) | ||
assert result.enrichment_percentile[0] == 1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a check to the second element in this Series?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what you mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert that result.enrichment_percentile[1]
equals what you'd calculate by hand.
Co-authored-by: Greg Way <[email protected]>
@gwaygenomics should be done with all your comments now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only minor comments, but they should be addressed before merge.
@@ -28,48 +30,54 @@ def enrichment( | |||
replicate_groups : List | |||
a list of metadata column names in the original profile dataframe to use as | |||
replicate columns. | |||
percentile : float | |||
enrichment_percentile : List of floats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
enrichment_percentile is still not the name of the function argument (line 19 above) do you see what I mean?
@@ -160,7 +156,7 @@ def test_evaluate_precision_recall(): | |||
replicate_groups=["Metadata_broad_sample"], | |||
operation="precision_recall", | |||
similarity_metric="pearson", | |||
precision_recall_k=k, | |||
precision_recall_k=[k], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you try one of these without a list? (i mean one of either lines 144 or 163)
It should work either way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a quick comment on line 151 noting the difference between evaluate on line 152 and evaluate on line 137?
For future changes to this code, it will be important to note that we're also testing the list/int capability of this argument
assert result_df.shape == (7, 4) | ||
assert result_df.percentile[0] == 1.0 | ||
assert result.shape == (7, 4) | ||
assert result.enrichment_percentile[0] == 1.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert that result.enrichment_percentile[1]
equals what you'd calculate by hand.
@@ -42,11 +42,11 @@ def test_precision_recall(): | |||
result = precision_recall( | |||
similarity_melted_df=similarity_melted_df, | |||
replicate_groups=replicate_groups, | |||
k=10, | |||
k=[5, 10], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also keep a test for k as an int
?
This test is good, but maybe add a test asserting that the outputs are the same for k=10
and k=[10]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Co-authored-by: Greg Way <[email protected]>
@gwaygenomics ready for you to look at again :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very minor comments, I'll merge after they are addressed!
@@ -160,7 +156,7 @@ def test_evaluate_precision_recall(): | |||
replicate_groups=["Metadata_broad_sample"], | |||
operation="precision_recall", | |||
similarity_metric="pearson", | |||
precision_recall_k=k, | |||
precision_recall_k=[k], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a quick comment on line 151 noting the difference between evaluate on line 152 and evaluate on line 137?
For future changes to this code, it will be important to note that we're also testing the list/int capability of this argument
Co-authored-by: Greg Way <[email protected]>
Co-authored-by: Greg Way <[email protected]>
@gwaygenomics back to you again :) |
What is this change request, how do I get rid of it? |
No worries, I get rid of it when I finally approve. Looking now |
Wonderfully done - merging now |
Enrichment and Precision recall shall intake a list of variables. This way the similarity matrix is only computed once!