Sequence Similarity Searching

These materials support a workshop on NCBI BLAST initially taught in Fall 2024, as part of the series Foundations in Genome Analyses, hosted by NUIT Research Computing and Data Services. These materials were prepared and presented by Pamela Shaw of Galter Health Sciences Library.

Introduction and Workshop Plan

Sequence similarity searching with BLAST (Basic Local Alignment Search Tool) is a long-standing practice in genome analysis. Although newer sequencing-based methods now dominate bioinformatics analyses, BLAST is still widely used to find evolutionarily related protein or nucleotide sequences in other species. If you have characterized a new gene in a bacterium, for example, you might want to find out whether the same sequence exists in other bacteria. More recently, BLAST has played an important role in identifying orthologs in other species of proteins from SARS-CoV-2, the virus that causes COVID-19, which sheds light on the virus's evolution.

We'll be using NCBI's BLAST suite of tools. We won't have time to explore them all, but we will use Nucleotide BLAST (blastn) and Protein BLAST (blastp). Many other BLAST tools are available as well, including translated BLAST (tblastn or blastx), BLAST genomes, Primer BLAST, specialized searches for immunoglobulins or conserved domains, and some alignment tools. The BLAST page also offers a downloadable version of BLAST for use on your server or local machine, a BLAST API, and an option to launch BLAST in the cloud.
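As a quick reference for the programs named above, the choice of tool follows from the type of the query and the type of the database. The lookup below is only an illustrative sketch: the function and table names are hypothetical and not part of any NCBI package, and it covers only the four programs mentioned in this section.

```python
# Which BLAST program fits a given query type and database type.
# Keys are (query_type, database_type). The "translated" programs
# convert nucleotide sequence to protein before comparing.
# Names here are illustrative, not part of any NCBI library.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",   # the query is translated
    ("protein", "nucleotide"): "tblastn",  # the database is translated
}


def choose_blast_program(query_type: str, database_type: str) -> str:
    """Return the BLAST program name for a query/database pairing."""
    key = (query_type.lower(), database_type.lower())
    if key not in BLAST_PROGRAMS:
        raise ValueError(f"unsupported combination: {key}")
    return BLAST_PROGRAMS[key]


print(choose_blast_program("nucleotide", "nucleotide"))
print(choose_blast_program("protein", "nucleotide"))
```

In this workshop only the first two pairings (blastn and blastp) are used.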

We are going to start with a DNA BLAST on a "Mystery sequence" to illustrate how Nucleotide BLAST can quickly find similar sequences for relatively short queries.
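The workshop runs this search in the NCBI web interface, but the same kind of query can also be expressed as a request to the BLAST URL API. The sketch below only builds the form body for such a submission and does not contact NCBI. It assumes the commonly documented parameters `CMD`, `PROGRAM`, `DATABASE`, and `QUERY`; the helper name, the choice of the `nt` database, and the short sequence are made up for illustration and are not the workshop's actual mystery sequence.

```python
from urllib.parse import urlencode

# Endpoint of the NCBI BLAST URL API.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_submission(query: str, program: str = "blastn",
                           database: str = "nt") -> str:
    """Build the URL-encoded form body for submitting a BLAST search.

    `query` is a raw sequence without a FASTA header line; whitespace
    and line breaks are removed and the sequence is upper-cased.
    """
    sequence = "".join(query.split()).upper()
    if not sequence:
        raise ValueError("query sequence is empty")
    params = {
        "CMD": "Put",          # "Put" submits a new search
        "PROGRAM": program,    # e.g. blastn or blastp
        "DATABASE": database,  # assumed database name for this sketch
        "QUERY": sequence,
    }
    return urlencode(params)


# A made-up placeholder sequence, NOT the workshop's mystery sequence.
placeholder = """
atggcgtacg ttagcctgaa
ccgtagctag
"""
print(build_blast_submission(placeholder))
```

Posting this body to `BLAST_URL` would, under these assumptions, return a Request ID (RID) that can later be polled with `CMD=Get` to retrieve results. For a single short query like the one in this exercise, the web form is usually the simpler route.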

Then we will use a published manuscript on a mutation in the SARS-CoV-2 spike protein as our starting point. We will find the sequences in the BLAST databases that contain this specific mutation in human COVID-19 and, time permitting, try to discover how many viruses from other species contain a similar spike protein receptor binding domain. This particular search can be very computationally taxing on the NCBI BLAST servers, so we have a backup BLAST search to use if we run into heavy server delays.

Support Materials

The NCBI has lots of support materials available in many formats: