Refactor workflow cbioportal_export #129

Open
8 tasks
eudesbarbosa opened this issue Apr 8, 2022 · 2 comments

Comments

@eudesbarbosa
Member

eudesbarbosa commented Apr 8, 2022

Suggested changes

  • All class names should use the CapWords convention.

  • Method cbioportalExportStepPart.iterate_over_biomedsheets should be broken up to follow the single-responsibility principle:

    • iterate_over_biomedsheets(): Responsible for yielding biomedsheets contents
    • cbioportalCnaFilesStepPart._get_input_file_gistic(): Responsible for input files for action 'gistic'
    • cbioportalCnaFilesStepPart._get_input_file_log2(): Responsible for input files for action 'log2'
    • cbioportalCnaFilesStepPart._get_input_file_segments(): Responsible for input files for action 'segments'
    • cbioportalZscoresStepPart._get_input_file_zscores_input(): Responsible for input files for action 'zscores_input'
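A minimal sketch of the split proposed above, assuming a simplified sheet structure. The `SHEETS` dictionary, the path template, the file suffixes, and the `get_input_files` dispatcher are hypothetical stand-ins for illustration, not the pipeline's actual data model or output layout. Class names follow the CapWords suggestion.

```python
from typing import Dict, Iterator, List

# Hypothetical stand-in for parsed sample sheets: sheet name -> tumor library names.
SHEETS: Dict[str, List[str]] = {
    "sheet_a": ["P001-T1-DNA1-WES1"],
    "sheet_b": ["P002-T1-DNA1-WES1"],
}


class CbioportalExportStepPart:
    """Base step part: only yields sheet contents."""

    def __init__(self, sheets: Dict[str, List[str]]):
        self.sheets = sheets

    def iterate_over_biomedsheets(self) -> Iterator[str]:
        """Yield every tumor library name without building any paths."""
        for libraries in self.sheets.values():
            yield from libraries


class CbioportalCnaFilesStepPart(CbioportalExportStepPart):
    """CNA step part: one private method per action."""

    # Hypothetical path template; the real output layout may differ.
    template = "work/{library}/out/{library}_{suffix}"

    def get_input_files(self, action: str) -> List[str]:
        """Dispatch to the method responsible for the given action."""
        return getattr(self, "_get_input_file_" + action)()

    def _format_all(self, suffix: str) -> List[str]:
        """Fill the template once per library yielded by the base class."""
        return [
            self.template.format(library=library, suffix=suffix)
            for library in self.iterate_over_biomedsheets()
        ]

    def _get_input_file_gistic(self) -> List[str]:
        return self._format_all("gene_call.txt")  # assumed suffix

    def _get_input_file_log2(self) -> List[str]:
        return self._format_all("gene_log2.txt")  # assumed suffix

    def _get_input_file_segments(self) -> List[str]:
        return self._format_all("segments.txt")  # assumed suffix
```

With this layout, the iteration logic lives in one place and each action's path construction can be tested on its own.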

Alternatively, iterate_over_biomedsheets could be run once in the workflow constructor and its results stored, for instance, in a dictionary. For an example, see VariantExportWorkflow._build_ngs_library_to_kit(): code
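The constructor-based alternative could look roughly like the sketch below. It assumes the sheets can be reduced to (tumor library, normal library) pairs; the class name, `_build_tumor_to_normal`, and `normal_for` are hypothetical and only loosely modeled on the idea behind `_build_ngs_library_to_kit()`.

```python
from typing import Dict, Iterable, Tuple


class CbioportalExportWorkflow:
    """Hypothetical workflow that walks the sample sheets exactly once."""

    def __init__(self, sheet_rows: Iterable[Tuple[str, str]]):
        # Each row is an assumed (tumor_library, normal_library) pair.
        self.tumor_to_normal = self._build_tumor_to_normal(sheet_rows)

    @staticmethod
    def _build_tumor_to_normal(rows: Iterable[Tuple[str, str]]) -> Dict[str, str]:
        """Consume the rows once and keep the result as a lookup table."""
        mapping: Dict[str, str] = {}
        for tumor, normal in rows:
            mapping[tumor] = normal
        return mapping

    def normal_for(self, tumor_library: str) -> str:
        """Answer later queries from the dictionary, not from the sheets."""
        return self.tumor_to_normal[tumor_library]
```

Step parts would then query the dictionary instead of re-iterating the sheets for every action.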

  • In cbioportalExportStepPart.iterate_over_biomedsheets, both {mapper} and {tools} should be read from the config instead of being hardcoded.
  • The relation between targeted CNV and WGS CNV should be made explicit. Currently, it is inferred from the tool listed in the config. The step will mangle or omit CNV results if the project contains both WES and WGS data.
        if self.config["path_copy_number_step"]:
            if self.config["cnv_tool"] in ["cnvetti_on_target_postprocess", "copywriter"]:
                self.register_sub_workflow(
                    "somatic_targeted_seq_cnv_calling",
                    workdir=self.config["path_copy_number_step"],
                    sub_workflow_name="copy_number_step",
                )
            else:
                self.register_sub_workflow(
                    "somatic_wgs_cnv_calling",
                    workdir=self.config["path_copy_number_step"],
                    sub_workflow_name="copy_number_step",
                )
  • The Snakemake file should be modified so that all rules get their input and output from the Python part.
  • All Snakemake rules should have an associated log file.
  • Merge rules with similar outputs (e.g., cbioportal_export_CNA_log2 and cbioportal_export_CNA_calls).
  • Consider merging rules that are simple shell calls (simple post/preprocessing?), e.g., cbioportal_export_concatenate_maf.
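One possible way to make the targeted vs. WGS CNV relation explicit is sketched below. It assumes a new config key, here called `cnv_assay_type`, which does not exist in the current config; the helper name and mapping are likewise hypothetical. Only the two sub-workflow names are taken from the snippet above.

```python
from typing import Any, Dict, Optional

# Sub-workflow names as used in the existing code; the assay keys are assumed.
SUB_WORKFLOW_BY_ASSAY = {
    "targeted": "somatic_targeted_seq_cnv_calling",
    "wgs": "somatic_wgs_cnv_calling",
}


def select_cnv_sub_workflow(config: Dict[str, Any]) -> Optional[str]:
    """Return the CNV sub-workflow to register, or None if CNV is disabled.

    Unlike an if/else on the tool name, an unknown or missing assay type
    raises an error instead of silently falling back to the WGS workflow.
    """
    if not config.get("path_copy_number_step"):
        return None
    assay = config.get("cnv_assay_type")  # hypothetical config key
    if assay not in SUB_WORKFLOW_BY_ASSAY:
        raise ValueError(
            "cnv_assay_type must be one of %s, got %r"
            % (sorted(SUB_WORKFLOW_BY_ASSAY), assay)
        )
    return SUB_WORKFLOW_BY_ASSAY[assay]
```

The returned name could then be passed to `register_sub_workflow` as in the existing code. Projects with both WES and WGS data would still need a per-library decision, which this sketch does not cover.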
@eudesbarbosa
Member Author

@ericblanc20 and @messersc, for future reference: this issue.

@eudesbarbosa
Member Author

Extracted from #156 :

Based on commit c469173

  • Update cbioportal_export default config.
  • Fix wrapper snappy_wrappers/wrappers/cbioportal/case_lists/wrapper.py.
  • Fix wrapper snappy_wrappers/wrappers/cbioportal/meta_files/wrapper.py.
  • Fix wrapper snappy_wrappers/wrappers/vcf2maf/vcf2maf/wrapper.py.
