Via Jeremy:
When reading more than ~5% of the samples or genes with read_tome_[sample/gene]_data(), it is often faster to read the entire matrix with read_tome_dgCMatrix().
This may be because read_tome_vector() both opens and closes the file on every call. I think this could be optimized either by reading everything in a single pass through read_tome_vector(), or by keeping the connection open while iterating over read_tome_vector().
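To make the suspected overhead concrete, here is a minimal sketch of the two access patterns in base R. It does not use scrattch.io or HDF5 at all: the flat binary file and the helpers read_vec_reopen() and read_vec_open() are hypothetical stand-ins invented for illustration, and whether read_tome_vector() actually behaves like the first pattern is an assumption, not something confirmed here.

```r
# Hypothetical stand-in for a tome: a flat binary file holding n_vec vectors
# of vec_len doubles each. This is NOT the HDF5 layout that scrattch.io uses.
path <- tempfile(fileext = ".bin")
n_vec <- 200L
vec_len <- 50L
set.seed(1)
values <- matrix(rnorm(n_vec * vec_len), nrow = vec_len)
writeBin(as.vector(values), path)

bytes_per_vec <- vec_len * 8L  # writeBin stores doubles as 8 bytes by default

# Pattern 1: open and close the file on every single read.
read_vec_reopen <- function(path, i) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  seek(con, where = (i - 1L) * bytes_per_vec)
  readBin(con, what = "double", n = vec_len)
}

# Pattern 2: the caller opens the connection once and reuses it.
read_vec_open <- function(con, i) {
  seek(con, where = (i - 1L) * bytes_per_vec)
  readBin(con, what = "double", n = vec_len)
}

idx <- seq_len(n_vec)

# One open/close cycle per vector.
time_reopen <- system.time(
  reopen_result <- lapply(idx, function(i) read_vec_reopen(path, i))
)

# A single open/close cycle around the whole loop.
time_shared <- system.time({
  con <- file(path, open = "rb")
  shared_result <- lapply(idx, function(i) read_vec_open(con, i))
  close(con)
})

# Both patterns must return the same data; only the overhead differs.
same_values <- identical(reopen_result, shared_result)
```

Comparing time_reopen with time_shared shows how much of the cost comes from repeated open/close calls on a given system. The gap is likely to be larger for HDF5 files on a network share than for this small local file, so treat the numbers as illustrative only.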
Pull request #19 should help a lot with read speed for subsets of samples or genes. @jeremymiller - could you test this some time using the dev branch and let me know whether it feels any faster in actual use? You can install it with devtools::install_github("AllenInstitute/scrattch.io", ref = "dev").
This is quite a bit faster than before (!!!). Thank you, @hypercompetent. Reading 1000 genes (~2% of the genes) takes ~3 seconds, compared with ~30 seconds for reading all of the data.
library(scrattch.io)

# Path to the tome file on the network share
tome <- "\\\\allen/programs/celltypes/workgroups/rnaseqanalysis/shiny/tomes/facs/human_MTG_bioRxiv/MTG_all.tome"

# Time reading a subset: exon counts for the first 1000 genes
system.time({
  plotGenes <- read_tome_gene_names(tome)[1:1000]
  plotData <- read_tome_gene_data(tome = tome, genes = plotGenes, regions = "exon", units = "counts")
  dim(plotData)
})

# Time reading the entire exon matrix for comparison
system.time({
  allData <- read_tome_dgCMatrix(tome, "/data/exon")
  dim(allData)
})