Add functionality to parse variants from VCF files #6

Merged: 10 commits, Mar 15, 2024

@grst (Contributor) commented Mar 7, 2024

Porting changes originally made by @christopher-mohr.

* Initial commit

* Update DESCRIPTION

* bootstrap tests

* Test CI

* Prototype function to load gene expression data to SE

* Stub functions

* Add progress bar

* Add pre-commit config

* Add personalis functions to package and run roxygen

* Add relevant documentation on personalis outputs

* Update exploratory notebook

* Fix pre-commit prettier

* Format files according to pre-commit checks

* Draft function to read small variant data

* configure lintr

* Implement basic functionality to read variant and GEX data into MAE

* Implement basic error handling for missing samples

* Generalize warning for missing samples

* Read CNV data

* Fix CNV IO function in case of empty tables

* Add function to read personalis HLA data

* Add function to read in TCR data

* Scrape TCR summary statistics from HTML

* Implement function to read somatic variant statistics

* Read summary stats for somatic variants

* Implement bumpy_matrix_to_df

* Read CNV summary statistics

* Read MSI info

* refactor

* Workaround for samples with no col in bumpy matrix

* Apply the fix also to small variant data

* Use "Genomic Variant" instead of pos as unique variant identifier

* Fix issue with reading non-somatic variants

* Handle case when there are no samples for a modality

* Fix duplicated mutation ids

* Fix column name incompatibility in newer HTML report versions

* stub vignette

* Add vignette

* Update vignette

* Ensure bumpy matrix, row and coldata have consistent order

* Fix alternative gex filename and CNV import

* Support alternative TCR path

* Fix column conversion in CNV reader

* Fix paths

* add function for parsing VCF files

* add functionality for reading and storing VCF data

* add/change comments

* add option to read small variant reports of type all

* Applied suggestion

* Applied suggestion

* add sample type check

* Applied suggestion

* Applied suggestion

* add report_type parameter

* Update README

* Fix reading CNV report

* Roxygenize

* Fix parse copy number report

---------

Co-authored-by: Christopher Mohr <[email protected]>
Co-authored-by: grst <[email protected]>
@grst marked this pull request as draft on March 7, 2024 13:19
@grst marked this pull request as ready for review on March 11, 2024 08:43

@christopher-mohr (Contributor) left a comment

LGTM 👍

@grst merged commit 8f73ffa into master on Mar 15, 2024 (5 checks passed)
@grst deleted the vcf2 branch on March 15, 2024 13:43