Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-mapping reads filtering support in countNonSplicedReads #91

Open
mincej opened this issue Oct 5, 2024 · 0 comments
Open

Multi-mapping reads filtering support in countNonSplicedReads #91

mincej opened this issue Oct 5, 2024 · 0 comments

Comments

@mincej
Copy link

mincej commented Oct 5, 2024

In the documentation from FRASER countRNAData, there is the following parameters:

...  Further parameters passed on to Rsubread::featureCounts.

However, when running with the parameters:

reads_onerun <- countRNAData(
    sampleID ="sample1",
    fds = fds, 
    NcpuPerSample = 8,
    recount = TRUE,
    keepNonStandardChromosomes = TRUE,
    genome = c("BSgenome.Hsapiens.UCSC.hg38"),
    countMultiMappingReads = FALSE
)

an error is returned:

Error in getNonSplitReadCountsForAllSamples(fds = fds, splitCountRanges = splitCountRanges,  : 
  unused arguments (sampleID = "sample1", countMultiMappingReads = TRUE)

In the nonspliced read counts function, the countMultiMappingReads parameter is hardcoded to FALSE.

FRASER/R/countRNAseqData.R

Lines 904 to 925 in ee6f279

rsubreadCounts <- featureCounts(files=bamFile, annot.ext=anno,
minOverlap=minAnchor*2,
allowMultiOverlap=TRUE,
checkFragLength=FALSE,
minMQS=bamMapqFilter(scanBamParam(fds)),
strandSpecific=strand,
# activating long read mode
isLongRead=longRead,
# multi-mapping reads
countMultiMappingReads=TRUE,
# unstranded case: for counting only non spliced reads we
# skip this information
isPairedEnd=isPairedEnd,
# sorting only needed for paired-end reads
autosort=doAutosort,
nthreads=NcpuPerSample,
tmpDir=file.path(file_path_as_absolute(workingDir(fds)), "cache")
)

This is different than countSplicedReads, which uses GenomicAlignment functions for counting which can use ScanBamParams and tagFilters for filtering bam files.

FRASER/R/countRNAseqData.R

Lines 534 to 537 in ee6f279

countsList <- bplapply(chromosomes, FUN=countSplitReadsPerChromosome,
bamFile=bamfile, pairedEnd=pairedend, genome=genome,
strandMode=strandmode, scanBamParam=scanbamparam,
BPPARAM=getBPParam(NcpuPerSample, length(chromosomes)))

FRASER/R/countRNAseqData.R

Lines 567 to 569 in ee6f279

galignment <- readGAlignmentPairs(
bamFile, param=param, strandMode=strandMode)
}

FRASER/R/countRNAseqData.R

Lines 613 to 614 in ee6f279

jc <- summarizeJunctions(galignment, genome=genome,
with.revmap=(as.logical(strandMode) && pairedEnd) )

What is the reason that you cannot alter the multimapping count parameters for nonspliced reads, but can for spliced reads? Or am I misunderstanding how these functions work in tandem, and that the filtering for multimapping reads is done by extension when defining the spliced ranges resulting from countSplicedReads and needed as input for countNonSplicedReads?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant