Extremely high percentage of reads too short to map... #28

GoogleCodeExporter · 2016-01-26T05:24:55Z

I have been using STAR with Illumina 101 bp paired-end reads. The first set of 
libraries I sequenced went through the pipeline without problems, but I have 
had a very strange problem with the most recent libraries.

I run STAR with the following command:

Star_Directory/STAR --genomeDir Star_Directory/STAR_2.3.0/Genome --readFilesIn $f $f2 --outSAMstrandField intronMotif --runThreadN 3

where f and f2 are the paired end reads:
1-Nq-C96_S94_L001_R1_001_val_1.fq 
1-Nq-C96_S94_L001_R2_001_val_2.fq

which have been trimmed by trim_galore with the call:
trim_galore -q 15 --phred33 --paired --length 50 -a CTGTCTCTTATACACATCT --stringency 3 $f $f2

where f and f2 are the untrimmed fastq files:
1-Nq-C96_S94_L001_R2_001.fastq 
1-Nq-C96_S94_L001_R1_001.fastq 
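
A quick sanity check at this stage is to summarize the read lengths in the trimmed files, since trim_galore's --length 50 should discard pairs in which either read falls below 50 bp. The sketch below assumes uncompressed FASTQ with exactly four lines per record; fastq_length_summary is a hypothetical helper name, not part of STAR or trim_galore.

```shell
# Hypothetical helper: print "reads min mean max" for a FASTQ file.
# Assumes uncompressed input with exactly four lines per record.
fastq_length_summary() {
  awk 'NR % 4 == 2 {
         # Line 2 of each record holds the sequence.
         len = length($0)
         n++
         sum += len
         if (n == 1 || len < min) min = len
         if (len > max) max = len
       }
       END {
         if (n == 0) { print "0 0 0 0"; exit }
         printf "%d %d %.1f %d\n", n, min, sum / n, max
       }' "$1"
}
```

Running fastq_length_summary on each of the two trimmed files should report the same number of reads for both mates and a minimum length of at least 50 if trimming behaved as expected.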

For these runs, the Log.final.out file shows something like this:

                                  Started job on |  Sep 17 13:16:13
                             Started mapping on |   Sep 17 13:17:17
                                    Finished on |   Sep 17 13:17:47
       Mapping speed, Million of reads per hour |   21.76

                          Number of input reads |   181350
                      Average input read length |   179
                                    UNIQUE READS:
                   Uniquely mapped reads number |   1973
                        Uniquely mapped reads % |   1.09%
                          Average mapped length |   176.75
                       Number of splices: Total |   24
            Number of splices: Annotated (sjdb) |   0
                       Number of splices: GT/AG |   23
                       Number of splices: GC/AG |   1
                       Number of splices: AT/AC |   0
               Number of splices: Non-canonical |   0
                      Mismatch rate per base, % |   0.39%
                         Deletion rate per base |   0.04%
                        Deletion average length |   2.22
                        Insertion rate per base |   0.00%
                       Insertion average length |   1.50
                             MULTI-MAPPING READS:
        Number of reads mapped to multiple loci |   948
             % of reads mapped to multiple loci |   0.52%
        Number of reads mapped to too many loci |   22
             % of reads mapped to too many loci |   0.01%
                                  UNMAPPED READS:
       % of reads unmapped: too many mismatches |   0.00%
                 % of reads unmapped: too short |   98.37%
                     % of reads unmapped: other |   0.01%
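
The percentages in a summary like this can be recomputed from the read counts, which helps confirm that nearly all input reads fall into the "too short" category. As far as the STAR documentation describes it, "too short" refers to the aligned portion of a read being short relative to the read length, not to the input read itself being short, and for paired-end data the reported read length is the sum of both mates. A minimal sketch, assuming the "label | value" layout shown above; star_log_value and percent_of are hypothetical helper names.

```shell
# Hypothetical helper: print the value for one label from a STAR summary log.
# $1 = log file, $2 = exact label text as it appears left of the "|".
star_log_value() {
  awk -F'|' -v label="$2" '
    {
      key = $1
      val = $2
      # Strip leading and trailing spaces or tabs from both fields.
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      if (key == label) { print val; exit }
    }' "$1"
}

# Hypothetical helper: print $1 as a percentage of $2 with two decimals.
percent_of() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", 100 * a / b }'
}
```

For example, percent_of 1973 181350 prints 1.09, matching the "Uniquely mapped reads %" line, and percent_of 948 181350 prints 0.52, matching the multi-mapping line.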

However, looking at the FASTQ files, the reads appear to be adequate for the 
most part.
I've attached abbreviated versions of the two paired-end read FASTQ files.

I've also attached abbreviated versions of two paired-end FASTQ files that 
mapped with a unique mapping percentage of approximately 90% (called 
read1/2_goodMappers.fq).

I am new to RNA-seq analysis, so this may be a trivial issue. I would 
appreciate any help.

I am using STAR 2.3.0 on Mac OS X.

Thanks so much.

Original issue reported on code.google.com by [email protected] on 18 Sep 2014 at 4:17

GoogleCodeExporter · 2016-01-26T05:24:55Z

It turns out my reads were just bad and they were not mapping to the genome...

Sorry for the trouble, back to making libraries!

Original comment by [email protected] on 13 Oct 2014 at 12:34

GoogleCodeExporter added Priority-Medium auto-migrated Type-Defect labels Jan 26, 2016
