annovar annotates bacterial genome variation #258

Open
ChenDepp opened this issue Oct 8, 2024 · 3 comments

ChenDepp commented Oct 8, 2024

Hi @d0ugal,
I have a question about ANNOVAR. As we know, the gene structures of eukaryotes and prokaryotes are different, so how can ANNOVAR be used to annotate mutations identified in bacterial genomes?
Have a good day.
Thanks

Contributor

kaichop commented Oct 8, 2024

Author

ChenDepp commented Oct 8, 2024

@kaichop
Thanks for your reply!
I have built my own annotation database. Prokaryotes such as bacteria do not have exons, yet the annotation results contain exon annotation information. Also, the codon table of prokaryotes is different from that of eukaryotes. Will this affect annotations such as synonymous and nonsense mutations?

Contributor

kaichop commented Oct 15, 2024

Hi @ChenDepp, ANNOVAR currently includes only the codon tables for eukaryotes and mitochondria, so it does not cover every possible scenario. However, unless there is a known exception, bacteria use the same codon table as eukaryotes, except that the start codon AUG encodes formylmethionine rather than methionine. That makes no real difference to the annotation process, so synonymous and nonsense mutation calls are not affected.
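To illustrate why the shared codon table gives the same synonymous/nonsense calls, here is a minimal Python sketch that classifies a codon change using the standard genetic code. It is not ANNOVAR's code; the function name `classify_codon_change` and the labels it returns are made up for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
# The amino-acid assignments are the same in the standard and bacterial tables.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Classify a single-codon substitution (hypothetical helper, not an ANNOVAR API)."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid (or stop to stop)
    if alt_aa == "*":
        return "nonsense"     # a stop codon is gained
    if ref_aa == "*":
        return "stop-loss"    # the stop codon is lost
    return "missense"         # a different amino acid


if __name__ == "__main__":
    print(classify_codon_change("CTG", "CTA"))  # Leu -> Leu
    print(classify_codon_change("TAT", "TAA"))  # Tyr -> stop
```

Because formylmethionine at the start codon is still read from AUG, nothing in this lookup changes for a typical bacterial gene.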

For the gene annotation database, you can treat each gene in the operon as a single-exon gene, so the annotation still works fine.

There is a more complex issue with SARS-CoV-2, where you can say that all peptides come from the same ORF1ab gene, or that there is one single large gene/protein that is processed into multiple peptides such as ORF2, ORF4, etc. ANNOVAR handles both scenarios. Bacteria typically do not have this type of complication, because each gene in an operon works independently as a single protein product. As long as you know the start and end positions, you can treat it as a gene and annotate it.
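As a rough sketch of the single-exon approach, the Python snippet below turns a gene's start and end positions into one line in UCSC genePred-extended layout, with the whole gene as one coding exon. Treat the details as assumptions: the function name, the example gene, and the contig name are hypothetical, and the leading `bin` column reflects my understanding of refGene-style files, so check the result against a gene file that already works with your ANNOVAR setup.

```python
def single_exon_genepred(name, chrom, strand, start_1based, end_1based, add_bin=True):
    """Build one genePred-extended line for a gene modeled as a single coding exon.

    Hypothetical helper for illustration. Input coordinates are 1-based and
    inclusive (as in GFF); genePred uses 0-based starts with exclusive ends.
    """
    tx_start = start_1based - 1
    tx_end = end_1based
    fields = [
        name, chrom, strand,
        tx_start, tx_end,      # transcript span
        tx_start, tx_end,      # CDS span: the whole gene is coding
        1,                     # exonCount
        f"{tx_start},",        # exonStarts (comma-terminated list)
        f"{tx_end},",          # exonEnds
        0,                     # score
        name,                  # name2 (gene symbol)
        "cmpl", "cmpl",        # cdsStartStat, cdsEndStat
        "0,",                  # exonFrames
    ]
    if add_bin:
        # Assumption: refGene-style files carry a leading bin column.
        fields.insert(0, 1)
    return "\t".join(str(field) for field in fields)


if __name__ == "__main__":
    # Hypothetical gene on a hypothetical contig.
    print(single_exon_genepred("geneA", "contig1", "+", 101, 1000))
```

Each gene in an operon would get its own line, which matches the idea that each one yields an independent protein product.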
