Grit should handle multiple groupings #12

Open
gwaybio opened this issue Jan 25, 2021 · 0 comments
Labels
enhancement New feature or request

Comments

Member

gwaybio commented Jan 25, 2021

The error below is raised because grit is calculated per perturbation, and each perturbation is expected to map to exactly one group. Cytominer eval should be aware of the "group_id" structure and support multiple groups (only a single group is allowed now).

Until then, with the single-group option, we need to calculate grit for each group independently.
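One way to calculate grit for each group independently is to split the profiles by the group column and run the evaluation once per subset. The sketch below is only an illustration under assumptions: `grit_per_group`, `evaluate_fn`, and `always_include` are hypothetical names that are not part of cytominer_eval, and adding shared rows (for example, control profiles) to every subset is an assumption about how the per-group runs would need to be set up.

```python
from typing import Callable, Optional

import pandas as pd


def grit_per_group(
    profiles: pd.DataFrame,
    group_col: str,
    evaluate_fn: Callable[[pd.DataFrame], pd.DataFrame],
    always_include: Optional[pd.Series] = None,
) -> pd.DataFrame:
    """Run evaluate_fn once per value of group_col and stack the results.

    Hypothetical helper, not part of cytominer_eval.
    always_include is an optional boolean mask (aligned to profiles.index)
    marking rows, such as control profiles, that are added to every subset.
    """
    if always_include is None:
        always_include = pd.Series(False, index=profiles.index)

    shared = profiles.loc[always_include]
    rest = profiles.loc[~always_include]

    results = []
    for group_value, subset in rest.groupby(group_col):
        # Each subset holds a single group value, plus the shared rows.
        subset = pd.concat([subset, shared])
        result = evaluate_fn(subset).copy()
        # Record which group this result came from (overwrites an existing column).
        result[group_col] = group_value
        results.append(result)

    if not results:
        return pd.DataFrame()
    return pd.concat(results, ignore_index=True)
```

With cytominer_eval, `evaluate_fn` could be a small wrapper that calls `evaluate(...)` with `operation="grit"` and the same arguments as the failing call, passing the subset as `profiles`.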

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-6-61c106f993e9> in <module>
      6 }
      7 
----> 8 grit_results_df = evaluate(
      9     profiles=df,
     10     features=features,

/usr/local/lib/python3.9/site-packages/cytominer_eval/evaluate.py in evaluate(profiles, features, meta_features, replicate_groups, operation, similarity_metric, percent_strong_quantile, precision_recall_k, grit_control_perts, mp_value_params)
     51         )
     52     elif operation == "grit":
---> 53         metric_result = grit(
     54             similarity_melted_df=similarity_melted_df,
     55             control_perts=grit_control_perts,

/usr/local/lib/python3.9/site-packages/cytominer_eval/operations/grit.py in grit(similarity_melted_df, control_perts, replicate_id, group_id)
     59     # Calculate grit for each perturbation
     60     grit_df = (
---> 61         similarity_melted_df.groupby(replicate_col_name)
     62         .apply(lambda x: calculate_grit(x, control_perts, column_id_info))
     63         .reset_index(drop=True)

/usr/local/lib/python3.9/site-packages/pandas/core/groupby/groupby.py in apply(self, func, *args, **kwargs)
    892         with option_context("mode.chained_assignment", None):
    893             try:
--> 894                 result = self._python_apply_general(f, self._selected_obj)
    895             except TypeError:
    896                 # gh-20949

/usr/local/lib/python3.9/site-packages/pandas/core/groupby/groupby.py in _python_apply_general(self, f, data)
    926             data after applying f
    927         """
--> 928         keys, values, mutated = self.grouper.apply(f, data, self.axis)
    929 
    930         return self._wrap_applied_output(

/usr/local/lib/python3.9/site-packages/pandas/core/groupby/ops.py in apply(self, f, data, axis)
    236             # group might be modified
    237             group_axes = group.axes
--> 238             res = f(group)
    239             if not _is_indexed_like(res, group_axes, axis):
    240                 mutated = True

/usr/local/lib/python3.9/site-packages/cytominer_eval/operations/grit.py in <lambda>(x)
     60     grit_df = (
     61         similarity_melted_df.groupby(replicate_col_name)
---> 62         .apply(lambda x: calculate_grit(x, control_perts, column_id_info))
     63         .reset_index(drop=True)
     64     )

/usr/local/lib/python3.9/site-packages/cytominer_eval/operations/util.py in calculate_grit(replicate_group_df, control_perts, column_id_info)
     94     Usage: Designed to be called within a pandas.DataFrame().groupby().apply()
     95     """
---> 96     group_entry = get_grit_entry(replicate_group_df, column_id_info["group"]["id"])
     97     pert = get_grit_entry(replicate_group_df, column_id_info["replicate"]["id"])
     98 

/usr/local/lib/python3.9/site-packages/cytominer_eval/operations/util.py in get_grit_entry(df, col)
    135 def get_grit_entry(df: pd.DataFrame, col: str) -> str:
    136     entries = df.loc[:, col]
--> 137     assert (
    138         len(entries.unique()) == 1
    139     ), "grit is calculated for each perturbation independently"

AssertionError: grit is calculated for each perturbation independently
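
The assertion fires on the group column, which suggests that at least one perturbation appears with more than one group value. A minimal sketch for finding such perturbations before calling `evaluate` follows. The function name `find_multi_group_perturbations` and the example labels are hypothetical, and the check works on the profile metadata rather than on the melted similarity table that cytominer_eval inspects, so treat it as an approximation.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set, Tuple


def find_multi_group_perturbations(
    pairs: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, Set[Hashable]]:
    """Map each perturbation seen with more than one group to its set of groups."""
    groups_by_pert = defaultdict(set)
    for pert, group in pairs:
        groups_by_pert[pert].add(group)
    # Keep only perturbations that would violate the one-group expectation.
    return {
        pert: groups for pert, groups in groups_by_pert.items() if len(groups) > 1
    }


# Hypothetical (perturbation, group) labels for illustration only.
pairs = [("pertA", "gene1"), ("pertA", "gene2"), ("pertB", "gene1")]
offenders = find_multi_group_perturbations(pairs)
print(offenders)
```

For a pandas DataFrame, the pairs can be built with `zip(df[replicate_col], df[group_col])`, where `replicate_col` and `group_col` stand for whichever metadata columns are passed as the replicate and group identifiers.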