ealyn/gem-workflow-demo

This workflow implements the GEM (Gene-Environment interaction analysis for Millions of samples) tool (https://github.com/large-scale-gxe-methods/GEM). GEM conducts genome-wide gene-environment interaction tests in unrelated individuals, with support for multiple exposures, adjustment for genotype-covariate interactions, and robust inference.

Author: Kenny Westerman ([email protected])

GEM tool information: https://github.com/large-scale-gxe-methods/GEM

Workflow steps:

  • Run GEM (scattered across an array of input files, usually chromosomes)
  • Concatenate the outputs into a single summary statistics file
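The concatenation step can be illustrated with a minimal Python sketch. It assumes each per-chromosome output is a delimited text file whose first line is a header shared by all files. The function name and the column names in the example are hypothetical, and the workflow's own concatenation task may be implemented differently.

```python
from typing import Iterable, List


def concatenate_summary_stats(chunks: Iterable[Iterable[str]]) -> List[str]:
    """Merge per-chromosome result lines, keeping the header only once.

    Each chunk is the sequence of lines from one scattered output file.
    Assumes (hypothetically) that every file starts with the same header.
    """
    merged: List[str] = []
    header = None
    for chunk in chunks:
        # Drop trailing newlines and skip blank lines.
        lines = [line.rstrip("\n") for line in chunk if line.strip()]
        if not lines:
            continue  # an empty output file contributes nothing
        if header is None:
            header = lines[0]
            merged.append(header)
        elif lines[0] != header:
            raise ValueError("header mismatch between input files")
        merged.extend(lines[1:])
    return merged


# Illustrative two-chromosome example; column names are made up.
chr1 = ["variant\tbeta\tp\n", "rs1\t0.10\t0.5\n"]
chr2 = ["variant\tbeta\tp\n", "rs2\t-0.20\t0.01\n"]
merged = concatenate_summary_stats([chr1, chr2])
```

In practice the chunks could be open file handles, since the function only iterates over lines.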

Inputs:

See the "parameter_meta" section of the .wdl script.

Outputs:

  • A summary statistics file containing effect estimates and p-values for the genetic main effects and the interaction effects, along with a joint test of the genetic main and interaction effects.

Cost estimation:

  • Example analysis for reference
    • Platform: Terra
    • Dataset: UK Biobank (N ~ 350k)
    • Analysis parameters: binary phenotype, single interaction term, 6 covariates
    • Computational resources requested: 4 CPUs, 10GB memory, 250GB disk
  • The above analysis completed for chromosome 2 (~1M variants) in about 9 hours, which corresponds to a cost of approximately $2 at typical Terra prices, assuming non-preemptible machines.
  • To extrapolate from the above estimates: runtime and cost should scale linearly with the sample size and the number of variants, and should be inversely proportional to the number of cores used.
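As a rough aid, the extrapolation rule can be written as a short Python sketch anchored to the reference run above (N ~ 350k, ~1M variants, 4 CPUs, about 9 hours, approximately $2). The function names are hypothetical, and the numbers are ballpark figures rather than guarantees. The cost function scales only with sample size and variant count at the reference core count, because how price changes with core count depends on machine pricing that is not described here.

```python
# Reference run taken from the example analysis (approximate values).
REF_SAMPLES = 350_000
REF_VARIANTS = 1_000_000
REF_CPUS = 4
REF_HOURS = 9.0
REF_COST_USD = 2.0


def estimate_runtime_hours(n_samples: int, n_variants: int, n_cpus: int) -> float:
    """Linear in samples and variants, inversely proportional to cores."""
    return (
        REF_HOURS
        * (n_samples / REF_SAMPLES)
        * (n_variants / REF_VARIANTS)
        * (REF_CPUS / n_cpus)
    )


def estimate_cost_usd(n_samples: int, n_variants: int) -> float:
    """Cost at the reference core count (4 CPUs); an assumption of this sketch."""
    return REF_COST_USD * (n_samples / REF_SAMPLES) * (n_variants / REF_VARIANTS)


# Hypothetical job: 100k samples, 500k variants, 8 CPUs.
hours = estimate_runtime_hours(100_000, 500_000, 8)
cost = estimate_cost_usd(100_000, 500_000)
print(f"~{hours:.2f} hours, ~${cost:.2f}")
```

Plugging in the reference values returns the reference runtime and cost, which is a quick sanity check on the scaling.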
