"Total Reads" misleading, how to get the input read count? #85

gpertea · 2024-01-30T14:48:34Z

Using release version 2.4.2
I've been using rnaseqc for a while on alignment files where I kept secondary alignments and also the unmapped reads. I naively assumed that the "Total Reads" metric means what the words say: the total number of reads found in the input data (each read counted once, of course). If a user, unlike me, did not keep the unaligned reads in the alignment files, I would expect the number there to reflect the total number of aligned reads found in the input alignments.

Only today I noticed the large discrepancy between that value and the number of reads reported in the HISAT2 alignment summary, so I suppose that for rnaseqc "Total Reads" in fact means total read alignments. Since in my case I kept the secondary alignments, reads can have multiple mappings, which is why the number I see there is inflated.
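To make the distinction concrete, here is a minimal sketch of counting records versus counting reads in SAM-formatted text. It is not how rnaseqc computes its metrics; the function name `count_records_and_reads` and the toy records are made up for illustration. It relies only on the SAM flag bits for secondary (0x100) and supplementary (0x800) alignments, and assumes each read has exactly one record with neither bit set, as the SAM specification prescribes.

```python
# Flag bits defined in the SAM specification.
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_records_and_reads(sam_lines):
    """Return (records, reads) for an iterable of SAM text lines.

    'records' counts every non-header line, including secondary and
    supplementary alignments and unmapped reads. 'reads' counts only
    lines with neither the secondary nor the supplementary bit set,
    i.e. one line per read (per mate, for paired-end data).
    """
    n_records = 0
    n_reads = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])  # FLAG is the second SAM column
        n_records += 1
        if not flag & (SECONDARY | SUPPLEMENTARY):
            n_reads += 1
    return n_records, n_reads


# Toy input (hypothetical): r1 has a primary and a secondary alignment,
# r2 is unmapped (flag 4) but was kept in the file.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t1\t4M\t*\t0\t0\tACGT\tIIII",
    "r1\t256\tchr2\t500\t1\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(count_records_and_reads(example))  # (3, 2)
```

On a real BAM file the same filter should correspond to `samtools view -c -F 0x900`, which counts records that are neither secondary nor supplementary. For paired-end data both counts are per mate, so the number of fragments would be half the read count.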

OK, my mistake for not checking such a basic metric earlier. So now I am looking through the metrics.tsv file for any way to get the total number of reads, but I am not able to find one. I see "End 1 Bases", "End 2 Bases", "End 1 Mapped Reads" etc., but not "End 1 Reads". (I cannot divide End 1/2 Bases by the read length to infer the number of reads, because the reads have various lengths due to trimming.)

The only number that seems to match the total number of reads is surprisingly found in this metric:
Unique Mapping, Vendor QC Passed Reads
But that label is then incorrect, because many of the reads are not "uniquely mapping", and I surely did not expect that metric to be the only place to get the number of input reads.

Is there any way to just get the number of reads (not read alignments) in the input alignment data?

francois-a · 2024-04-21T06:07:27Z

Hi, thanks for reporting this. The latest commit (a6b85ef) fixes this, returning both total alignments and total reads.

clariB · 2024-04-24T20:17:32Z

Can these changes be included in a downloadable static executable? I cloned the repository and the usage doesn't match what's in the readme.
