distiller (bwa mem) cannot handle Hi-C data sequenced with 25bp #141
Comments
Bowtie2 command that did work for mapping 25bp reads: Thank you! |
Is there currently a supported sequencer that produces 25 bp reads? I haven't heard of any data in the last ~5 years with such short reads! |
It's just regular MiSeq paired end run. Since I'm working with a small genome and just testing several conditions, I figured it would be sufficient. But now I wish I would have used longer reads (>50bp)... |
Author of |
this seems important!
Not sure what would be the best way to deal with this issue. After a bit of
googling, it does seem that aln would do a better job with shorter reads:
https://sourceforge.net/p/bio-bwa/mailman/message/34978319/
http://crazyhottommy.blogspot.com/2017/06/bwa-aln-or-bwa-mem-for-short-reads-36bp.html
Importantly, mapping with aln is two-step: first, you run bwa aln and get
.sai files, then you run bwa samse/sampe to convert the .sai files to .sam files.
We could introduce a switch to the logic of distiller if you feel that it
could be important!
Anton.
…On Wed, 15 May 2019 at 09:36, Ilya Flyamer ***@***.***> wrote:
Author of minimap says bwa-mem should be better for Hi-C.
"Furthermore, I also realize that bwa-mem will be better than minimap2 at
Hi-C alignment because bwa-mem is more sensitive to short matches."
https://lh3.github.io/2018/04/02/minimap2-and-the-future-of-bwa
|
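The switch between `bwa mem` and the two-step `bwa aln` route discussed above could look roughly like the sketch below. This is not distiller's code: the function names `pick_bwa_mode` and `print_mapping_commands`, the 50 bp default cutoff, and the file names are all hypothetical placeholders. The script only prints the commands it would run, so it can be tried without a bwa index.

```shell
#!/usr/bin/env bash
# Sketch only: choose between bwa mem and the two-step bwa aln route.
# pick_bwa_mode and the 50 bp default cutoff are hypothetical, not distiller settings.

pick_bwa_mode() {
    local read_len=$1
    local cutoff=${2:-50}   # placeholder threshold, tune for your data
    if [ "$read_len" -lt "$cutoff" ]; then
        echo "aln"
    else
        echo "mem"
    fi
}

# Print (do not run) the mapping commands for one paired-end chunk.
print_mapping_commands() {
    local mode=$1 index=$2 fq1=$3 fq2=$4
    if [ "$mode" = "aln" ]; then
        # Step 1: one .sai file per side; step 2: bwa sampe combines them into SAM.
        echo "bwa aln $index $fq1 > side1.sai"
        echo "bwa aln $index $fq2 > side2.sai"
        echo "bwa sampe $index side1.sai side2.sai $fq1 $fq2"
    else
        # -S and -P skip mate rescue and pairing, a common choice for Hi-C.
        echo "bwa mem -SP $index $fq1 $fq2"
    fi
}

# Example: 25 bp reads fall below the placeholder cutoff, so the aln route is printed.
print_mapping_commands "$(pick_bwa_mode 25)" index_base reads_1.fastq reads_2.fastq
```

Keeping the decision in one small function means the cutoff can be exposed as a single configuration value, whatever value turns out to be sensible in practice.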
@golobor @Phlya @Marlies1993 after a little chat with Job - this is no longer URGENT, but should probably be addressed anyways - couple of arguments:
in our case it seems like
it's a shame it's not "plug n play" between |
Thank you for this update! These arguments seem very reasonable, indeed. |
Hi, this is not a problem only for short MiSeq reads. In methods like Hi-CO (https://doi.org/10.1016/j.cell.2018.12.014), the meaningful part of each read is limited to 15-36 bp because of the complex ligation procedure and adapter trimming.
```
bwa aln -t ${bwa_threads} ${bwa_index_base} ${fastq1} > ${library}.${run}.${ASSEMBLY_NAME}.${chunk}.1.sai
bwa aln -t ${bwa_threads} ${bwa_index_base} ${fastq2} > ${library}.${run}.${ASSEMBLY_NAME}.${chunk}.2.sai
bwa sampe ${bwa_index_base} ${library}.${run}.${ASSEMBLY_NAME}.${chunk}.1.sai ${library}.${run}.${ASSEMBLY_NAME}.${chunk}.2.sai ${fastq1} ${fastq2} \
    | pairtools parse ${dropsam_flag} ${dropreadid_flag} ${dropseq_flag} \
        ${parsing_options} \
        -c ${chrom_sizes} \
    | pairtools sort --nproc ${sorting_threads} \
        -o ${library}.${run}.${ASSEMBLY_NAME}.${chunk}.pairsam.${suffix} \
        --tmpdir \$TASK_TMP_DIR \
    | cat
```
|
Hi, there is now a test distiller option for short reads mapping with bwa aln (see long_reads option of map): Any suggestions/improvements and testing are much appreciated! |
Rather major stuff, as discovered by @Marlies1993: `bwa mem` yields empty alignments for sequences of 25bp... apparently it isn't designed for 25bp, and something like `bwa aln` should be used instead...
@golobor @nvictus @mimakaev any thoughts on that? Switch to `minimap`? I know nothing about `bwa aln`...
We'll try to proceed with `bowtie2` for this dataset and will try to avoid 25bp in future, but I think this issue is important enough to consider and address.
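Since the problem only shows up with very short reads, it can help to check the read length in a FASTQ file before choosing an aligner. The sketch below assumes plain (uncompressed) four-line FASTQ records. `max_read_length` is a hypothetical helper, not part of distiller or bwa.

```shell
#!/usr/bin/env bash
# Sketch: report the longest read among the first N records of a FASTQ file.
# max_read_length is a hypothetical helper, not part of distiller or bwa.

max_read_length() {
    local fastq=$1
    local n_reads=${2:-1000}   # how many records to sample from the top of the file
    # In four-line FASTQ records the sequence is the second line of each record.
    head -n $((n_reads * 4)) "$fastq" \
        | awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) } END { print max + 0 }'
}

# Tiny demo file with two short reads.
demo=$(mktemp)
printf '@r1\nACGTACGTACGTACGTACGTACGTA\n+\nIIIIIIIIIIIIIIIIIIIIIIIII\n' > "$demo"
printf '@r2\nTTTTACGTACGTACGTACGTACGTA\n+\nIIIIIIIIIIIIIIIIIIIIIIIII\n' >> "$demo"
max_read_length "$demo"
rm -f "$demo"
```

For gzipped input, the same function could be fed through `zcat` first. Sampling only the first records keeps the check fast on large files, at the cost of missing length variation further down.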