New Analysis Example: Finding co-expressed gene modules with CoGAPS #348

Open
cansavvy opened this issue Nov 6, 2020 · 0 comments

cansavvy commented Nov 6, 2020

What are the goals of this new example analysis?

People often want to find genes that are coexpressed. WGCNA is often the default method, and while it may be fine for some use cases, CoGAPS is a slightly more sophisticated, albeit more compute-intensive, method for similar questions. The CoGAPS authors also appear to have very well-made vignettes and documentation, a quality we look for in tools that we recommend to users and trainees.

What kind of dataset will this need?

CoGAPS looks for latent spaces using non-negative matrix factorization. This means we want a dataset that is big enough (has enough genes and enough samples) to run this on, but not so large that it can't be run locally. Finding one may take some trial and error.
The CoGAPS example in the vignette has 9 samples and 1363 genes, so we probably want something at least that big, and bigger is probably better.
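
For anyone unfamiliar with the idea, here is a rough sketch of what non-negative matrix factorization does to a genes x samples matrix. It is written in plain Python rather than R, and it uses the classic multiplicative-update rule, not the Bayesian MCMC approach that CoGAPS itself implements, so treat it as an illustration of the concept only. The names `nmf`, `frobenius_error`, and `toy_expression` are made up for this example, and the toy values are not real expression data.

```python
import random

def matmul(X, Y):
    """Matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Factor non-negative V (genes x samples) into W (genes x k) and
    H (k x samples) with multiplicative updates (illustrative only)."""
    rng = random.Random(seed)
    n_genes, n_samples = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(n_genes)]
    H = [[rng.random() + eps for _ in range(n_samples)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H

def frobenius_error(V, W, H):
    """Frobenius norm of V - WH (the reconstruction error)."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for vr, rr in zip(V, WH) for v, r in zip(vr, rr)) ** 0.5

# Toy matrix: 6 genes x 4 samples with two made-up modules
# (genes 0-2 are high in samples 0-1, genes 3-5 are high in samples 2-3).
toy_expression = [
    [5, 4, 0, 0],
    [4, 5, 0, 0],
    [6, 5, 0, 0],
    [0, 0, 3, 4],
    [0, 0, 4, 3],
    [0, 0, 5, 5],
]

W, H = nmf(toy_expression, k=2)
print("reconstruction error:", frobenius_error(toy_expression, W, H))
```

Each column of `W` gives a per-gene weight for one pattern and each row of `H` gives that pattern's activity across samples, which is why the dataset needs enough genes and enough samples for the patterns to be meaningful.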

What steps should be included in this analysis?

We can sort through the CoGAPS vignette and determine which steps we find most useful after running the main CoGAPS function. These aren't hard-and-fast steps because I haven't run this myself yet; they are items we should explore:

  • This function has some parallel computing options that we will want to use so that CoGAPS runs in a reasonable amount of time on a user's local machine.

  • It's unclear to me at this point, but their vignette seems to suggest that some fiddling with parameters may be needed, or at least should be explored (another reason this should be an "advanced topic"), so we should give some guidance about how to explore and choose parameters.

  • We usually like to leave our users with some nice visuals. The CoGAPS vignette has one plot, but we may want to think about another, more publication-ready visual that users might like to see (a heatmap or something better than that).

What packages/methods do you recommend using or looking into for this analysis?

CoGAPS is installed from Bioconductor. I haven't run the full thing, but their documentation does warn that it takes a good amount of computing time because of the non-negative matrix factorization involved. We may need a RAM requirements warning/suggestion for users on this example, another reason for it to be in the "advanced topics" section.
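
As a starting point for a RAM note, here is a back-of-the-envelope calculation (in Python, purely for the arithmetic) of how much memory the input matrix alone takes when stored as 8-byte doubles, which is how R stores numeric matrices. The function name `dense_matrix_bytes` is made up for this sketch. This is only a floor: the working memory CoGAPS needs on top of the input is something we would have to measure, and is not estimated here.

```python
def dense_matrix_bytes(n_genes, n_samples, bytes_per_value=8):
    """Memory for a dense genes x samples matrix of doubles (input only)."""
    return n_genes * n_samples * bytes_per_value

# The vignette example mentioned above: 1363 genes x 9 samples.
vignette_bytes = dense_matrix_bytes(1363, 9)
print(f"vignette matrix: {vignette_bytes} bytes "
      f"({vignette_bytes / 1024**2:.2f} MiB)")
```

The vignette-sized matrix comes to 98136 bytes, so the input itself is tiny; any RAM warning would be driven by larger datasets and by whatever CoGAPS allocates while running.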
