I have 12 genomes with their GFF annotation files, and I have already built a graph-based pangenome from an SV VCF file using vg construct and vg index.
I also have some RNA-seq data that I want to align to the graph pangenome.
It seems that I should rebuild the graph pangenome using vg autoindex -w mpmap -v sv.vcf.gz rather than reuse the index above. The options -r and --tx-gff can be repeated in vg autoindex, though: should I supply just one genome as the reference, or all 12 genomes?
I hope for your response.
Thanks!
vg autoindex is designed to take common interchange formats like FASTA and VCF and produce internal vg formats like the ones you get from vg index. So, yes, you would not use your already-constructed indexes if you want to use vg autoindex.
Most users starting from a VCF+FASTA will only have GFFs for the reference sequence, so I'm not sure what your 12 GFFs look like. VCF doesn't always neatly preserve contig coordinates, so I think it would be very difficult to get sensible results using haplotype-specific GFFs. Certainly, the pipeline is better tested and hardened using one GFF. The reason we allow multiple GFF inputs is more to accommodate users who have GFFs that are split up by chromosome.
Thanks for this.
We assembled 12 genomes and annotated them, so we have multiple FASTA and GFF files.
I will use only one genome and its corresponding GFF file as input for vg autoindex.
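For anyone following the same route, here is a minimal sketch of what that single-reference call might look like. The file names (ref.fa, ref.gff, sv.vcf.gz) and the output prefix are placeholders, not files from this thread, and the helper function is only there so the command can be printed and checked before running it.

```shell
# Sketch: build a spliced mpmap index from ONE reference assembly and its annotation.
# All file names are placeholders (assumptions), not files from this thread.
autoindex_cmd() {
  # $1 = reference FASTA, $2 = GFF for that same reference,
  # $3 = SV VCF called against that reference, $4 = output prefix
  printf 'vg autoindex --workflow mpmap --ref-fasta %s --tx-gff %s --vcf %s --prefix %s\n' \
    "$1" "$2" "$3" "$4"
}

# Print the command first so it can be inspected.
autoindex_cmd ref.fa ref.gff sv.vcf.gz pangenome

# To actually run it (requires vg on PATH and the real input files):
# eval "$(autoindex_cmd ref.fa ref.gff sv.vcf.gz pangenome)"
```

The FASTA passed to --ref-fasta should be the assembly the VCF was called against, and the GFF should use the same contig names as that FASTA.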
Best!
If you build a graph using the raw assemblies (e.g. using Minigraph-Cactus), you could also supply a GFA file containing the haplotypes and then also provide the individual haplotype annotations to vg autoindex using --hap-tx-gff.
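A rough sketch of how that GFA-based call could be assembled is below. It assumes that --hap-tx-gff can be given once per haplotype annotation and that the sequence names in each GFF match the haplotype path names in the GFA; both are assumptions worth checking against vg autoindex --help. Whether a reference --tx-gff is also needed is not stated in this thread. The file names graph.gfa and hap*.gff are placeholders.

```shell
# Sketch: assemble a vg autoindex call from a GFA plus per-haplotype annotations.
# graph.gfa and hap*.gff are placeholder names (assumptions).
hap_autoindex_cmd() {
  # $1 = GFA containing the haplotype paths, $2 = output prefix,
  # remaining arguments = one GFF per annotated haplotype
  gfa=$1
  prefix=$2
  shift 2
  cmd="vg autoindex --workflow mpmap --gfa $gfa --prefix $prefix"
  for gff in "$@"; do
    cmd="$cmd --hap-tx-gff $gff"   # one flag per haplotype annotation
  done
  printf '%s\n' "$cmd"
}

# Print the command for two haplotype annotations; extend the list to all 12.
hap_autoindex_cmd graph.gfa pangenome hap1.gff hap2.gff
```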