monkey_proteins

Code for all analyses performed in a project that studied monkey proteins across a large number of genomes.

First, I downloaded all required genomes (listed in the file genomes_download_links.csv) using their curl links. To see the full script, go to

Script 01.Downloading_ENA_genomes
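As an illustration of the download step, here is a minimal Python sketch (the actual script uses curl). It assumes nothing about the column layout of genomes_download_links.csv and simply treats any field that starts with http://, https:// or ftp:// as a link. The function names and the `genomes` output folder are hypothetical.

```python
import csv
import urllib.request
from pathlib import Path


def read_links(csv_path):
    """Return every CSV field that looks like a download URL (assumed layout)."""
    links = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            for field in row:
                field = field.strip()
                if field.startswith(("http://", "https://", "ftp://")):
                    links.append(field)
    return links


def download_all(links, out_dir="genomes"):
    """Fetch each link into out_dir, skipping files that already exist."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for url in links:
        # Name the local file after the last path component of the URL.
        target = out / url.rstrip("/").split("/")[-1]
        if not target.exists():
            urllib.request.urlretrieve(url, str(target))
```

Calling `download_all(read_links("genomes_download_links.csv"))` would then fetch every listed genome. Skipping existing files makes it safe to rerun after an interrupted download.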

Before BLASTing the selected proteins against the genomes, I corrected their sequence format: the bases of each sequence were split across several lines, which can cause a lot of trouble later on if it goes unnoticed. To fix this, I pasted each sequence into the Firefox search bar, which joins it into a single line, and then copied the sequences along with their headers into files I created using vim. This, and the processing I performed on the downloaded genomes, can be seen in

Script 02.Processing_protANDgenomes
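The same line-joining can be done programmatically. The sketch below is not the method used in this project (which was manual); it is a small Python alternative that assumes standard FASTA input, where header lines start with `>`. The function names are hypothetical.

```python
def unwrap_fasta(lines):
    """Yield (header, sequence) pairs with each sequence joined onto one line."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        else:
            chunks.append(line)
    # Emit the last record; sequence text before any header is dropped.
    if header is not None:
        yield header, "".join(chunks)


def write_unwrapped(in_path, out_path):
    """Rewrite a FASTA file so that every sequence occupies a single line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, seq in unwrap_fasta(src):
            dst.write(f"{header}\n{seq}\n")
```

Headers are kept exactly as they appear, so downstream steps that match on sequence names are unaffected.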

After processing our queries and the genomes that will form our genomic database, we'll BLAST the protein nucleotide sequences against the downloaded genomes. The full list of assemblies in the genomic database, with the name and taxonomic assignment of each, can be found in the Genomes_DB_list.tsv file. For that, go to

Script 03.BLAST_proteins
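For orientation, a BLAST+ run of this kind usually has two parts: building a nucleotide database from the genomes with `makeblastdb`, then searching it with the queries. The Python sketch below only assembles the command lines. The choice of `blastn`, the tabular output format (`-outfmt 6`) and all file names are assumptions, not taken from the script itself.

```python
import subprocess


def makeblastdb_cmd(genomes_fasta, db_name):
    """Command that builds a nucleotide BLAST database from a genome FASTA."""
    return ["makeblastdb", "-in", genomes_fasta, "-dbtype", "nucl", "-out", db_name]


def blast_cmd(query_fasta, db_name, out_tsv, program="blastn"):
    """Command that searches the database and writes tabular (outfmt 6) hits.

    `program` is an assumption: use "tblastn" instead if the queries are
    amino-acid sequences rather than nucleotide sequences.
    """
    return [program, "-query", query_fasta, "-db", db_name,
            "-outfmt", "6", "-out", out_tsv]


def run(cmd):
    """Execute a command, raising an error if BLAST+ exits with a failure."""
    subprocess.run(cmd, check=True)
```

With BLAST+ installed, `run(makeblastdb_cmd("genomes.fasta", "genomes_db"))` followed by `run(blast_cmd("proteins.fasta", "genomes_db", "hits.tsv"))` would produce a tab-separated hit table (file names here are placeholders).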

After BLASTing our proteins of interest against the monkey genomes, we'll separate short hits from long hits and keep only those results whose e-value is smaller than 0.001. For that, go to

Script 04.Processing_BLAST_output
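A minimal Python sketch of this kind of filtering is shown here. It assumes the BLAST results are in the default tabular format (`-outfmt 6`), where column 4 is the alignment length and column 11 is the e-value. The 100-base cutoff between short and long hits is a hypothetical placeholder, not the value used in this project.

```python
import csv


def split_hits(tsv_path, max_evalue=0.001, min_long_length=100):
    """Return (short_hits, long_hits), keeping only rows with e-value < max_evalue."""
    short_hits, long_hits = [], []
    with open(tsv_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row or row[0].startswith("#"):
                continue  # skip blank lines and comment lines
            length = int(row[3])     # alignment length (outfmt 6, column 4)
            evalue = float(row[10])  # e-value (outfmt 6, column 11)
            if evalue >= max_evalue:
                continue
            if length >= min_long_length:
                long_hits.append(row)
            else:
                short_hits.append(row)
    return short_hits, long_hits
```

Each returned row is the original list of fields, so the two groups can be written back out as separate tab-separated files.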

