-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
- Start of ToxicRmod commit #211
base: main
Are you sure you want to change the base?
Changes from 2 commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
library("devtools") | ||
#requires dependency Cmake | ||
#devtools::install_github("NIEHS/ToxicR") | ||
library("ToxicR") | ||
library("dplyr") | ||
library("PMCMRplus") | ||
library(parallel) | ||
library("pbmcapply") | ||
#bmdoutput <- read.table("~/storage/biomarker_input_.txt", | ||
# header=FALSE, sep = "\t", row.names = 1) | ||
bmdoutput <- read.table("~/storage/ToxicR mod/data for BMDExpress log2 CPM Males.txt", | ||
header=FALSE, sep = "\t", row.names = 1) | ||
# | ||
#bmdoutput <- bmdoutput[-3,-c(1:3,5:9,11:15,17:18)] | ||
# | ||
bmdoutput <- as.data.frame(t(bmdoutput)) | ||
bmdoutput <-bmdoutput[,-1] | ||
bmdoutput[-1] <- lapply(bmdoutput[-1], function(x) { | ||
as.numeric(as.character(x)) | ||
}) | ||
|
||
dose_column <-as.numeric(bmdoutput[,1]) | ||
start.time <- Sys.time() | ||
Func <- function(resp_bygene_column){ | ||
trendtestres<- williamsTest(resp_bygene_column ~ dose_column, alternative = c("greater", "less")) | ||
trendestresdf <- data_frame("pass/fail value" = trendtestres$crit.value[,1] - trendtestres$statistic[,1]) | ||
#only generate model when significant based on williams test | ||
if (any(trendestresdf$`pass/fail value` <0)){ | ||
model<-single_continuous_fit(dose_column,resp_bygene_column) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It doesn't look like you're specifying any model; I guess the default is Hill? The ideal way to approach this would be to wrap the whole function so that it can iterate over model types (ideally also in parallel) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes the default was hill, I have implemented modelling of all 5 available models, and to save the best one based on The maximum value of the likelihod/posterior, I read that this is an assessment of the goodness of fit and a higher maximum would be considered the best model, please correct me if I am wrong. I have also implemented this in parallel processing (thank you for teaching me about it, it has changed my world), I used nested mclapplys but was worried about how it would be affected given the cores distributed to the workers (maybe there is an optimal split?), I also used foreach dopar, based on this discussion: https://stackoverflow.com/questions/51707443/should-mclapply-calls-be-nested. Indeed after testing, foreach dopar is slightly faster than nested mclapply with maximum cores used for both. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. output file also now contains the gene the bmd and the model that was the best used |
||
#capture p-value output from ToxicR for goodness of fit | ||
modeloutput<- capture.output(summary(model)) | ||
all_numbers_mean_adequate_pvalue <- lapply(modeloutput[9:10], function(line) { | ||
as.numeric(tail(regmatches(line, gregexpr("\\b\\d+\\.\\d+\\b", line))[[1]],n=1)) | ||
}) | ||
#only return model if goodness of fit is significant | ||
if(!any(as.data.frame(all_numbers_mean_adequate_pvalue) >0.05)){ | ||
return(as.numeric(model$bmd[1])) | ||
} | ||
} | ||
} | ||
|
||
result <- pbmclapply(bmdoutput[,2:ncol(bmdoutput)], function(col) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can speed this up much more by passing mc.cores (usually a good idea to use num CPUs - 1 - see some examples and discussion here: https://stackoverflow.com/questions/28954991/whether-to-use-the-detectcores-function-in-r-to-specify-the-number-of-cores-for) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. my apologies I thought I read that it automatically used the maximum amount of cores but you are correct including detectcores greatly improves speed! |
||
return(Func(col)) | ||
}) | ||
|
||
#structuring of results to remove NULLs and have rows for output | ||
signficant_BMD_table <- t(as.data.frame(Filter(Negate(is.null), result))) | ||
end.time <- Sys.time() | ||
time.taken <- round(end.time - start.time,2) | ||
time.taken | ||
write.table(signficant_BMD_table, file ="~/storage/toxicR output.txt", sep=" ", col.names = TRUE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@raiduhs pointed out a downside to this, we can't set alpha with this implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
He also gave this a test at his end and found that it might need to be run for "greater" and "less" as separate tests, can you confirm whether you get both alternative hypotheses output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have updated the code to test both greater and less, I am also check the ToxicR williamtrendtest and unfortunately it also does not allow for choosing an alpha value, I think we might be revamping the williamtrendtest in general anyways based on the BMDexpress source files I have sent you