Grafpop&CHARR #7

Open
wants to merge 257 commits into base: main
Commits
614c43b
small edits to sex_infer.R updated logic
Nov 30, 2023
641316e
new Dockerfile as well as corresponding wdl
Dec 7, 2023
364ed71
increasing memory slightly on for ancestry inference
Dec 7, 2023
74401dc
Small edit to infer_sex.R to be more robust
Dec 8, 2023
206806a
another small change to infer_sex.R
Dec 8, 2023
4d64591
Add wdl to transfer files to GCS automatically
Jan 5, 2024
55c6129
adding for purposes of usability of grafpop results
Jan 12, 2024
9b28687
chmod a file
Jan 12, 2024
a5f0ca5
chmod a file update
Jan 12, 2024
2e3fdcc
fixed error in sex_infer.R
Jan 13, 2024
4ec367d
added some helper functions
Jan 23, 2024
330ed1e
added an automatic converter for json file formats
Jan 25, 2024
76c7553
added charr Dockerfile
Jan 29, 2024
a434597
edited charr Dockerfile
Jan 29, 2024
adbfb7c
edited charr dockerfile
Jan 30, 2024
856f54d
edited charr dockerfile
Jan 30, 2024
2974dae
adding echtvar
Jan 30, 2024
d281e5e
adding echtvar
Jan 30, 2024
143b383
updated json_converter.py; works now
Jan 30, 2024
a6c28dd
bcftools docker added
Jan 31, 2024
d31bcc6
bcftools docker edited
Jan 31, 2024
cc11688
bcftools docker edited
Jan 31, 2024
55b2b87
edit bcftool docker
Jan 31, 2024
bc0fdef
edit bcftool docker
Jan 31, 2024
fbd2083
edit bcftool docker
Jan 31, 2024
5a18d62
added file for echtvar
Feb 1, 2024
2ae5f87
added bcftools to charr docker
Feb 1, 2024
e64fdb6
edited charr docker
Feb 1, 2024
f711e24
edited charr docker
Feb 1, 2024
f1dbaec
edited charr docker
Feb 1, 2024
0d74930
edited charr docker
Feb 1, 2024
efbddc5
edited charr docker
Feb 1, 2024
fafa7fb
edited charr docker
Feb 1, 2024
b8040be
updated WGSFastqToCram wdl
Feb 5, 2024
ce949df
No more raw gvcfs being transferred
Feb 7, 2024
41d72fb
small edit to G2C transfer file wdl
Feb 7, 2024
10f4374
bcftools docker
Feb 16, 2024
63d19e6
added grafpop and charr wdls
Feb 26, 2024
d8c7b88
Delete scripts/gsi_helpers/data/gsc.tsv
noahfields1 Feb 26, 2024
3c794e9
Delete scripts/gsi_helpers/data/gsc_filtered.tsv
noahfields1 Feb 26, 2024
8ef5107
Delete scripts/gsi_helpers/data/gsc_filtered_test_genes.tsv
noahfields1 Feb 26, 2024
bfb1467
Delete scripts/gsi_helpers/data/.DS_Store
noahfields1 Feb 26, 2024
70d5500
Added analysis for GSC study
Mar 4, 2024
d322b1f
Adding case-control matching for AoU UFC study
Mar 7, 2024
6d69b37
update to matching algorithm
Mar 7, 2024
74e5dc2
Added code to consolidate data
Mar 11, 2024
a28ba23
added some new WDLs
Mar 21, 2024
bd0420f
small adjustments to aou_charr.wdl
Mar 21, 2024
4b0efda
added a VEP_hg19 version that is bad practice
Mar 25, 2024
03a36db
added edit to bad practice Vep hg19 wdl
Mar 25, 2024
31cfea1
editing hg19 Vep
Mar 25, 2024
cd6e480
editing hg19 Vep
Mar 25, 2024
1a27fa8
editing hg19 Vep
Mar 25, 2024
a47345f
editing hg19 Vep
Mar 25, 2024
5111e82
editing hg19 Vep
Mar 25, 2024
4130de2
adding new hg19 Vep
Mar 25, 2024
46efdd0
adding new hg19 Vep
Mar 25, 2024
38b889e
adding new hg19 Vep
Mar 25, 2024
bf52140
edit hg19 Vep
Mar 25, 2024
d68210b
edit hg19 Vep
Mar 25, 2024
5fb609b
edit hg19 Vep
Mar 26, 2024
764e230
edit VEP_hg19
Mar 26, 2024
7ab4fe7
edit VEP_hg19
Mar 26, 2024
e8d2208
edit VEP_hg19
Mar 26, 2024
d3a1678
edit VEP_hg19
Mar 26, 2024
10add93
edit VEP_hg19
Mar 26, 2024
26eb520
edit VEP_hg19
Mar 26, 2024
e44020a
edit VEP_hg19.wdl
Mar 27, 2024
119b00f
edit VEP_hg19.wdl
Mar 27, 2024
c3bda48
edit VEP_hg19.wdl
Mar 27, 2024
a066386
edit VEP_hg19.wdl
Mar 28, 2024
74702f0
edit VEP_hg19.wdl
Mar 28, 2024
35f1da7
edit VEP_hg19.wdl
Mar 28, 2024
3ec9561
edit VEP_hg19.wdl
Mar 28, 2024
8ef2e58
edit VEP_hg19.wdl
Mar 28, 2024
2d9b8dc
edit VEP_hg19.wdl
Mar 28, 2024
124058b
edit VEP_hg19.wdl
Mar 28, 2024
62e3a04
edit VEP_hg19.wdl
Mar 28, 2024
f153557
edit VEP_hg19.wdl
Mar 28, 2024
315542c
edit VEP_hg19.wdl
Mar 28, 2024
19ce81c
edit VEP_hg19.wdl
Mar 28, 2024
9137630
edit VEP_hg19.wdl
Mar 29, 2024
c9ed99b
edit VEP_hg19.wdl
Mar 29, 2024
c97d62b
added method to annotate gsc tsv
Apr 3, 2024
5e77c63
add some wdls
Apr 10, 2024
40237eb
edited wdls
Apr 10, 2024
31b6d50
edited wdls
Apr 10, 2024
de7aa04
edited wdls
Apr 10, 2024
0635cca
edited wdls
Apr 10, 2024
2d99f8f
edited wdls
Apr 10, 2024
b356f3f
edited wdls
Apr 10, 2024
e86a53b
edited wdls
Apr 10, 2024
85e0378
edited wdls
Apr 10, 2024
7830388
edited wdls
Apr 11, 2024
411d844
edited wdls
Apr 11, 2024
3aad4f2
edited wdls
Apr 11, 2024
96799dd
edited wdls
Apr 11, 2024
01dc5e1
edited wdls
Apr 11, 2024
5e7ef66
edited GSC_CC wdl
Apr 16, 2024
6ca54be
edited GSC_CC wdl
Apr 16, 2024
ad3eb60
new edit to cohort_data_qc wdl
Apr 19, 2024
66bd922
updated germline somatic convergence
Apr 23, 2024
7a58aa0
updated cohort metrics wdl
Apr 23, 2024
221abeb
added new wdls and changed some names
May 1, 2024
c26bb8c
edited nc somatic wdl
May 1, 2024
873bb49
edited nc somatic wdl
May 1, 2024
616921c
edited nc somatic wdl
May 1, 2024
e35f904
edited nc somatic wdl
May 1, 2024
30b02c6
edited nc somatic wdl
May 1, 2024
078a44d
small edit to somatic noncoding gsc
May 3, 2024
a74c6ac
updated wdls
May 13, 2024
fc94d9a
updated wdls
May 13, 2024
da2447a
generate sample map wdl
May 17, 2024
ed323ab
changed capitalization error in Coding_Convergence.gsc.wdl
Jul 24, 2024
1b3a48b
small syntax error
Jul 24, 2024
4658f1c
small edit to gsc
Jul 26, 2024
8125c14
small edit to gsc
Jul 26, 2024
35fb318
fixed convergence
Jul 26, 2024
d60c4ca
added hail_rf docker
Jul 29, 2024
03e2ffc
edited hail_rf docker
Jul 29, 2024
1fa0fd0
edited hail_rf docker
Jul 29, 2024
992073e
change file name
Jul 30, 2024
66a6667
added pydata stack docker
Aug 7, 2024
0ba757d
added pydata stack docker
Aug 7, 2024
905e579
added pydata stack docker
Aug 7, 2024
6cc2f76
new docker container: gsc_pytools
Sep 25, 2024
8ce64d8
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
64811e5
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
870c3f1
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
1483577
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
b4c95b4
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
4ebcec4
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
719e697
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
d8c53a7
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
ea54965
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
49f42b2
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
a5d38b1
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
8aab8b3
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
3682bcc
new docker container: gsc_pytools
fieldsnoa Sep 25, 2024
fd26a8e
added statsmodels to docker container
Sep 25, 2024
01adca0
small fix
Sep 26, 2024
7cab128
small fix
Sep 26, 2024
c023ac3
small change in mutation freq calc
fieldsnoa Sep 26, 2024
7e08d84
trying to add firthlogist
fieldsnoa Sep 26, 2024
8447387
switching to python3.9 for firthlogist
fieldsnoa Sep 26, 2024
460b216
switching to python3.9 for firthlogist
fieldsnoa Sep 26, 2024
4291a4a
switching to python3.9 for firthlogist
fieldsnoa Sep 26, 2024
3e0102b
switching to python3.9 for firthlogist
fieldsnoa Sep 26, 2024
aa84eb9
small edit to fishers exact logic
fieldsnoa Sep 26, 2024
6112a40
including firth logistic regression
fieldsnoa Sep 26, 2024
5cb7e4a
including firth logistic regression
fieldsnoa Sep 26, 2024
c896be6
fixing gsc_pytools docker
fieldsnoa Sep 26, 2024
7454381
fixing gsc_pytools docker
fieldsnoa Sep 26, 2024
fce8c69
fixing gsc_pytools docker
fieldsnoa Sep 26, 2024
ab5c8d0
fixing gsc_pytools docker
fieldsnoa Sep 26, 2024
0e3940d
fixing gsc_pytools docker
fieldsnoa Sep 26, 2024
4ca5e79
overhaul to firthLogist
fieldsnoa Sep 26, 2024
d5bf7b3
overhaul to firthLogist
fieldsnoa Sep 26, 2024
4800b65
trying to get firthLogist to converge
fieldsnoa Sep 26, 2024
8417a9c
adjustment to firth logistic regression parameter options
fieldsnoa Sep 27, 2024
3fb6d6e
small edit to gsc_util.py
fieldsnoa Sep 27, 2024
c4a89e4
small edit in the case snp is not in df
Sep 27, 2024
5281178
small capitalization error
Sep 27, 2024
7a6e527
small python edit
Sep 28, 2024
643ac4d
revert to earlier
Sep 28, 2024
5c920e0
added functionality for meta anlysis
fieldsnoa Sep 30, 2024
ad58d0e
added functionality for meta anlysis
fieldsnoa Sep 30, 2024
d0f64df
added functionality for meta anlysis
fieldsnoa Sep 30, 2024
8f23e49
added functionality for meta anlysis
fieldsnoa Sep 30, 2024
cfc9cf4
added functionality for meta anlysis
fieldsnoa Sep 30, 2024
4e2d825
added some graphing functions
fieldsnoa Sep 30, 2024
9bc955d
graphing fixes
Sep 30, 2024
199c48f
small edits
Oct 1, 2024
ecf6005
small edits
Oct 1, 2024
f713bf1
edit to graphing
fieldsnoa Oct 1, 2024
a6a7e00
small graph edit
fieldsnoa Oct 1, 2024
7fd78c6
edit to meta analysis p_val calc
fieldsnoa Oct 2, 2024
0b82753
edit to p_val calc
fieldsnoa Oct 2, 2024
0cb80ad
implemented FDR for graphing
fieldsnoa Oct 3, 2024
53d979d
edits to allele freq calc
fieldsnoa Oct 3, 2024
be9f91d
fixing AF
fieldsnoa Oct 3, 2024
25099a9
small bug
fieldsnoa Oct 3, 2024
17c3ac9
fixed FDR problem
fieldsnoa Oct 4, 2024
dc2eb57
firth fallback implemented
fieldsnoa Oct 15, 2024
f6fc25a
firth fallback implemented
fieldsnoa Oct 15, 2024
45baa9e
firth fallback implemented
fieldsnoa Oct 15, 2024
b9ad12d
firth fallback implemented
fieldsnoa Oct 15, 2024
f82cfdf
firth fallback implemented
fieldsnoa Oct 15, 2024
1882860
firth fallback implemented
fieldsnoa Oct 15, 2024
866c57e
firth fallback implemented
fieldsnoa Oct 15, 2024
eac195e
firth fallback implemented
fieldsnoa Oct 15, 2024
27c58eb
added seaborn to pydata_stack
fieldsnoa Oct 16, 2024
451a67b
new ufc docker
fieldsnoa Oct 16, 2024
e885fd4
small error in g2c_ufc
fieldsnoa Oct 16, 2024
e7eb33a
fixed firth fallback
fieldsnoa Oct 17, 2024
7d3ac89
edit to firth fallback
fieldsnoa Oct 17, 2024
c9879b5
updated for all gsc contexts
fieldsnoa Oct 18, 2024
11d7f28
small update
fieldsnoa Oct 18, 2024
f92b21d
small update
fieldsnoa Oct 18, 2024
f63936d
small update
fieldsnoa Oct 18, 2024
2e991b4
simplified analysis in WDL
fieldsnoa Oct 23, 2024
35f37cd
simplified analysis in WDL
fieldsnoa Oct 23, 2024
0f96f05
simplified analysis in WDL
fieldsnoa Oct 23, 2024
627cee8
simplified analysis in WDL
fieldsnoa Oct 23, 2024
7274219
simplified analysis in WDL
fieldsnoa Oct 23, 2024
5c4f38b
simplified analysis in WDL
fieldsnoa Oct 23, 2024
2ca8ed0
simplified analysis in WDL
fieldsnoa Oct 24, 2024
b9288e8
simplified analysis in WDL
fieldsnoa Oct 24, 2024
ae075b9
simplified analysis in WDL
fieldsnoa Oct 25, 2024
c0b6499
added new covariates for HMF
fieldsnoa Oct 29, 2024
bef5f26
added primary 1 hot variate for HMF
fieldsnoa Oct 29, 2024
ffe0a64
new covariates for PROFILE log reg analysis
fieldsnoa Nov 6, 2024
f0aeece
edited graph function
fieldsnoa Nov 6, 2024
5ea735e
small edits to graph making
fieldsnoa Nov 13, 2024
56ddb9b
small edits to graph making
fieldsnoa Nov 14, 2024
21e9d35
added GWAS miniplots
fieldsnoa Nov 15, 2024
54439be
fixed plotting error
fieldsnoa Nov 15, 2024
2d6283c
fixed plotting errors. and missing kidney data
fieldsnoa Nov 18, 2024
9aa642f
fixed plotting errors.
fieldsnoa Nov 19, 2024
61632b0
fixed plotting errors. i think
fieldsnoa Nov 19, 2024
13437ad
fixed some plotting errors. Legend still needs love.
fieldsnoa Nov 20, 2024
d4cf0d8
Added in cancer_type as covariate for pancancer log-reg model. Volcan…
fieldsnoa Nov 20, 2024
8af997b
Added in profile germline coding. Volcano Plot Legend still needs love.
fieldsnoa Nov 20, 2024
d3fa8e6
Added in profile germline coding. Volcano Plot Legend still needs love.
fieldsnoa Nov 21, 2024
cccd668
Profile coding coding interaction stats printed out. Needs some qc ap…
fieldsnoa Nov 21, 2024
a9ecfff
Altered way we call germline coding variants
fieldsnoa Dec 2, 2024
d9816b2
Altered way we call germline coding variants
fieldsnoa Dec 2, 2024
b6b4bf3
Altered way we call germline coding variants
fieldsnoa Dec 2, 2024
f97c0df
Altered way we call germline coding variants
fieldsnoa Dec 2, 2024
07833db
added qqman package to g2c_ufc
fieldsnoa Dec 6, 2024
4c70373
added qqman
fieldsnoa Dec 6, 2024
cebaeed
added qqman
fieldsnoa Dec 6, 2024
82a163e
added qqman
fieldsnoa Dec 6, 2024
e4d5ed5
added qqman
fieldsnoa Dec 6, 2024
6b1af03
added qqman
fieldsnoa Dec 6, 2024
96a971b
added qqman
fieldsnoa Dec 6, 2024
6dd3bc7
added qqman
fieldsnoa Dec 6, 2024
97da7d6
added qqman
fieldsnoa Dec 6, 2024
377e0fc
added qqman
fieldsnoa Dec 6, 2024
ef9c645
added qqman
fieldsnoa Dec 6, 2024
69df487
retrying qqman docker
fieldsnoa Dec 10, 2024
cfe0192
retrying qqman docker
fieldsnoa Dec 10, 2024
83dc478
retrying qqman docker
fieldsnoa Dec 10, 2024
f9f58f8
retrying qqman docker
fieldsnoa Dec 10, 2024
a4a0553
retrying qqman docker
fieldsnoa Dec 10, 2024
a317285
retrying qqman docker
fieldsnoa Dec 11, 2024
c02ce0b
retrying qqman docker
fieldsnoa Dec 11, 2024
77c63f5
retrying qqman docker
fieldsnoa Dec 11, 2024
925eed1
edit to extracting germline variants
fieldsnoa Dec 13, 2024
6bc5a89
small edit in extract germline variant task
fieldsnoa Dec 13, 2024
firth fallback implemented
fieldsnoa committed Oct 15, 2024
commit 866c57e979eeaee7086c2ea58bd5a57501c70125
24 changes: 15 additions & 9 deletions docker/gsc_pytools/gsc_util.py
@@ -50,12 +50,18 @@ def logistic_regression_with_fallback(df, cancer_type, germline_event, somatic_g
 print("Regular logistic regression converged successfully!")

 # Extract p-value and odds ratio for the germline_gene predictor
-p_value = result.pvalues[1] # Assuming germline_event is the first predictor after the constant
-odds_ratio = np.exp(result.params[1])
+p_value = result.pvalues[germline_event] # Assuming germline_event is the first predictor after the constant
+odds_ratio = np.exp(result.params[germline_event])

-# Confidence intervals
-conf_int = result.conf_int()
-conf_int_or = np.exp(conf_int.iloc[1]) # Convert log-odds CI to odds ratio CI
+# Calculate the 95% confidence interval for the odds ratio
+coef = result.params[germline_gene]
+std_err = result.bse[germline_gene]
+
+# The confidence interval for the coefficient
+conf_int_coef = [coef - 1.96 * std_err, coef + 1.96 * std_err]
+
+# Convert the confidence interval from log odds to odds ratio
+conf_int_or = np.exp(conf_int_coef)

 # Return the results: odds ratio, p-value, and confidence intervals
 return odds_ratio, p_value, conf_int_or[0], conf_int_or[1]
@@ -74,12 +80,12 @@ def logistic_regression_with_fallback(df, cancer_type, germline_event, somatic_g
 model.fit(X, y)

 # Extract p-value and odds ratio for the germline_gene predictor
-p_value = model.pvals_[1] # Assuming the predictor is the second column after the constant
-odds_ratio = np.exp(model.coef_[1])
+p_value = model.pvals_[germline_gene] # Assuming the predictor is the second column after the constant
+odds_ratio = np.exp(model.coef_[germline_gene])

 # Calculate confidence intervals
-coef = model.coef_[1]
-std_err = model.bse_[1]
+coef = model.coef_[germline_gene]
+std_err = model.bse_[germline_gene]
 conf_int_coef = [coef - 1.96 * std_err, coef + 1.96 * std_err]
 conf_int_or = np.exp(conf_int_coef)
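
Both hunks replace positional lookups (`[1]`) with lookups by predictor name and compute a Wald 95% interval as coef ± 1.96 × SE before exponentiating. Below is a minimal sketch of that pattern, not the code in `gsc_util.py`. The helper names `wald_odds_ratio_ci` and `position_of` and the numeric values are hypothetical. It assumes the coefficients and standard errors are pandas Series indexed by column name, which is what statsmodels returns when the design matrix is a DataFrame. The added lines use both `germline_event` and `germline_gene`; whichever name is passed has to match a column label for the lookup to succeed.

```python
import numpy as np
import pandas as pd


def wald_odds_ratio_ci(params, bse, predictor, z=1.96):
    """Return (odds_ratio, ci_low, ci_high) for one named predictor.

    params and bse are pandas Series indexed by predictor name.
    """
    coef = params[predictor]
    std_err = bse[predictor]
    # Wald interval on the log-odds scale, then exponentiate.
    low, high = np.exp([coef - z * std_err, coef + z * std_err])
    return float(np.exp(coef)), float(low), float(high)


def position_of(columns, predictor):
    """Map a predictor name to its integer column position."""
    return pd.Index(columns).get_loc(predictor)


# Hypothetical fitted values, for illustration only.
params = pd.Series({"const": -2.0, "germline_event": 0.5, "age": 0.01})
bse = pd.Series({"const": 0.30, "germline_event": 0.20, "age": 0.005})

odds_ratio, ci_low, ci_high = wald_odds_ratio_ci(params, bse, "germline_event")
print(odds_ratio, ci_low, ci_high)

# Estimators that expose plain NumPy arrays cannot be indexed by name,
# so translate the name to a position first.
coef_array = params.to_numpy()
idx = position_of(params.index, "germline_event")
print(coef_array[idx])
```

If the fallback model stores its coefficients as NumPy arrays rather than Series, a string index raises an error, so the `position_of` step (or an equivalent) would be needed. Whether an intercept shifts that position depends on the estimator and should be checked against its documentation.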