Using release version 2.4.2
I've been using rnaseqc for a while on alignment files in which I kept both the secondary alignments and the unmapped reads. I naively assumed that the "Total Reads" metric means what the words say: the total number of reads found in the input data (each read counted once, of course). If a user, unlike me, did not keep the unaligned reads in the alignment files, I would expect the number to reflect the total number of aligned reads found in the input alignments.
Only today I noticed the large discrepancy between that value and the number of reads reported in the HISAT2 alignment summary, so I suppose "Total Reads" in fact means "total read alignments" for rnaseqc. Since I kept the secondary alignments in my case, reads can have multiple mappings, which is why the number I see there is inflated.
OK, my mistake for not checking such a basic metric before. So now I am looking through the metrics.tsv file for any way to get the total number of reads, but I am not able to find it. I see "End 1 Bases", "End 2 Bases", "End 1 Mapped Reads", etc., but not "End 1 Reads". (I cannot divide End 1/2 Bases by the read length to infer the number of reads, because the reads have various lengths due to trimming.)
The only number that seems to match the total number of reads is, surprisingly, found in this metric: "Unique Mapping, Vendor QC Passed Reads".
But that label is then incorrect, because many of the reads are not "uniquely mapping", and I certainly did not expect that metric to be the only place to get the number of input reads.
Is there any way to just get the number of reads (not read alignments) in the input alignment data?
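To make clear which number I am after, here is a minimal sketch of how I would count reads rather than alignments outside of rnaseqc. It relies only on the SAM FLAG bits (0x100 secondary, 0x800 supplementary, 0x4 unmapped, 0x80 second in pair) and assumes each read has exactly one primary record in the file. The function name `count_reads` and the keys it returns are my own choices for illustration, not anything rnaseqc provides.

```python
from collections import Counter

# SAM FLAG bits, as defined in the SAM specification
UNMAPPED = 0x4
READ2 = 0x80
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def count_reads(sam_lines):
    """Count reads (not alignments) from an iterable of SAM text lines.

    Assumption: every read has exactly one primary record, so counting
    records that are neither secondary nor supplementary counts each
    read once (each mate of a pair is counted as its own read).
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@')
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t", 2)[1])  # FLAG is the second column
        if flag & (SECONDARY | SUPPLEMENTARY):
            counts["extra_alignments"] += 1
            continue
        counts["reads"] += 1
        if flag & UNMAPPED:
            counts["unmapped_reads"] += 1
        # Single-end reads carry neither mate bit; they are counted as end 1 here
        if flag & READ2:
            counts["end2_reads"] += 1
        else:
            counts["end1_reads"] += 1
    return counts
```

The lines can come from `samtools view` output, for example by passing `sys.stdin` to `count_reads`. As far as I know, `samtools view -c -F 0x900 input.bam` gives the same overall total directly, but it would be much more convenient to have this number in metrics.tsv.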