Support multiple columns as replicate indicators #28

Open
gwaybio opened this issue Jan 7, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@gwaybio
Member

gwaybio commented Jan 7, 2021

In grit() and mp_value() specifically, we can add support for a list of columns that together indicate replicates, rather than only a single string (one column).
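As a rough illustration of what accepting either a string or a list could look like, here is a minimal sketch in plain Python. The helper names (`normalize_replicate_columns`, `group_replicates`) and the metadata column names are hypothetical and are not part of the package; the sketch only shows how a single column and a list of columns can be handled by the same code path.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple, Union

# Hypothetical type: one column name, or several column names
ReplicateSpec = Union[str, Sequence[str]]


def normalize_replicate_columns(replicate_id: ReplicateSpec) -> List[str]:
    """Accept one column name or a list of them; always return a list."""
    if isinstance(replicate_id, str):
        return [replicate_id]
    return list(replicate_id)


def group_replicates(
    profiles: List[dict], replicate_id: ReplicateSpec
) -> Dict[Tuple, List[int]]:
    """Map each unique combination of replicate column values to row indices."""
    columns = normalize_replicate_columns(replicate_id)
    groups = defaultdict(list)
    for index, row in enumerate(profiles):
        # The tuple of values across all requested columns is the replicate key
        groups[tuple(row[column] for column in columns)].append(index)
    return dict(groups)


# Hypothetical metadata for four profiles
profiles = [
    {"Metadata_gene": "TP53", "Metadata_guide": "TP53-1"},
    {"Metadata_gene": "TP53", "Metadata_guide": "TP53-1"},
    {"Metadata_gene": "TP53", "Metadata_guide": "TP53-2"},
    {"Metadata_gene": "MYC", "Metadata_guide": "MYC-1"},
]

# A single string groups by one column
by_gene = group_replicates(profiles, "Metadata_gene")

# A list groups by the combination of columns
by_gene_and_guide = group_replicates(profiles, ["Metadata_gene", "Metadata_guide"])
```

With the single column, all three TP53 profiles count as replicates of each other; with both columns, only the two profiles sharing the same guide do.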

@gwaybio gwaybio added the enhancement New feature or request label Jan 7, 2021
@gwaybio
Member Author

gwaybio commented Jan 25, 2021

We can also do this for group_id.

@gwaybio
Member Author

gwaybio commented Feb 16, 2021

I decided today not to pursue multiple-column support for group_id. The difficulty arises when we need to define control perturbations. Grit, as currently formulated, is based on pairwise correlations between the target profiles and all other profiles. If we allow multiple columns in group_id, we will also need to specify a column hierarchy that says which group should be ignored when determining the control (reference).

In other words, specifying multiple groups would also require us to specify which group to ignore when selecting controls. For example, if I calculate grit on two plates of CRISPR profiles using "target gene" and "cell line" as the group_id, I only want to use the pairwise correlations to controls within the same cell line. Adding multiple groups would complicate things substantially. The current approach, calling evaluate twice (once per cell line in this example) with only one group column each time, is our preferred method in version 0.1. Calling it separately for each unique group also currently reduces the number of unnecessary pairwise correlation calculations.
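To make the pairwise-correlation point concrete, here is a small sketch in plain Python. It does not call the package; `split_by_column` and `n_pairs` are hypothetical helpers, and the cell line and gene labels are made up. It only counts how many profile pairs would be compared if everything were evaluated together versus once per cell line.

```python
from collections import defaultdict
from typing import Dict, List


def split_by_column(profiles: List[dict], column: str) -> Dict[str, List[dict]]:
    """Partition profiles into one subset per unique value of `column`."""
    subsets = defaultdict(list)
    for row in profiles:
        subsets[row[column]].append(row)
    return dict(subsets)


def n_pairs(n: int) -> int:
    """Number of unordered pairs among n profiles."""
    return n * (n - 1) // 2


# Hypothetical data: two cell lines, four profiles each (one is a control)
profiles = [
    {"cell_line": line, "target_gene": gene}
    for line in ("line_A", "line_B")
    for gene in ("TP53", "TP53", "MYC", "control")
]

per_line = split_by_column(profiles, "cell_line")

for cell_line, subset in per_line.items():
    # This is where the evaluation would be called once per cell line,
    # with "target_gene" as the single group column, so that controls
    # are only drawn from the same cell line.
    print(cell_line, len(subset), "profiles")

# Pairs if all profiles were correlated against each other in one call
pairs_all = n_pairs(len(profiles))

# Pairs when each cell line is handled on its own
pairs_within = sum(n_pairs(len(subset)) for subset in per_line.values())
```

In this toy case the single combined call would cover 28 pairs, while the two per-cell-line calls cover 12 in total, and none of those 12 cross cell lines.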
