rSNPdata

An R package to manage and analyse Plasmodium falciparum whole-genome SNP data.

Requirements

Before installing this package, make sure the following tools are installed:

bcftools
tabix
vcflib
vcftools
BEDOPS
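
A quick way to confirm that these tools are reachable from R is to look them up on the `PATH` with base R's `Sys.which()`. The sketch below is only illustrative: the executable names checked for vcflib (`vcffilter`) and BEDOPS (`bedops`) are assumptions, since both projects ship several binaries, and `check_tools` is a hypothetical helper that is not part of rSNPdata.

```r
# Executable names are assumptions: vcflib and BEDOPS each ship several
# binaries, so one representative binary is checked for each of them.
required_tools <- c(
  bcftools = "bcftools",
  tabix    = "tabix",
  vcflib   = "vcffilter",  # assumed representative vcflib binary
  vcftools = "vcftools",
  BEDOPS   = "bedops"      # assumed representative BEDOPS binary
)

# Hypothetical helper (not part of rSNPdata).
# Sys.which() returns "" for executables that are not found on the PATH.
check_tools <- function(tools) {
  paths <- Sys.which(tools)
  found <- nzchar(paths)
  names(found) <- names(tools)
  found
}

status <- check_tools(required_tools)
missing_tools <- names(status)[!status]
if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
} else {
  message("All required tools were found.")
}
```

If a tool is reported as missing even though it is installed, the R session may be using a different `PATH` from the interactive shell.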



Install

library(devtools)
devtools::install_github("FellouMada/rSNPdata", build_vignettes = TRUE)
library(rSNPdata)

Manual

browseVignettes("rSNPdata")

DESCRIPTION

| Function name | Description |
|---|---|
| get_snpdata | Create a SNPdata object. The functions in this package require a SNPdata object, which this function generates |
| print | Print the SNPdata object |
| compute_MAF | Calculate the minor allele frequency (MAF) of each SNP based on the allelic depth |
| calculate_Fws | Calculate the within-host genetic diversity index (Fws) |
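
To make the idea of a depth-based minor allele frequency concrete, here is a minimal base R sketch. It assumes two matrices of read counts (rows are SNPs, columns are samples), one for the reference allele and one for the alternate allele, and pools the reads across samples. The function `maf_from_depth` and this exact formula are illustrative assumptions and may differ from what `compute_MAF` does internally.

```r
# Hypothetical helper (not part of rSNPdata): MAF per SNP from pooled reads.
# ref_depth, alt_depth: numeric matrices, rows = SNPs, columns = samples.
maf_from_depth <- function(ref_depth, alt_depth) {
  ref_total <- rowSums(ref_depth, na.rm = TRUE)
  alt_total <- rowSums(alt_depth, na.rm = TRUE)
  total <- ref_total + alt_total
  # The minor allele is whichever allele has fewer supporting reads.
  maf <- pmin(ref_total, alt_total) / total
  maf[total == 0] <- NA_real_  # no reads at this SNP: MAF is undefined
  maf
}

# Toy read counts for three SNPs in three samples.
ref <- rbind(snp1 = c(10, 0, 8), snp2 = c(5, 5, 0),  snp3 = c(0, 0, 0))
alt <- rbind(snp1 = c(0, 12, 2), snp2 = c(5, 5, 10), snp3 = c(0, 0, 0))

maf <- maf_from_depth(ref, alt)
print(maf)
```

In this toy data, snp1 has 18 reference and 14 alternate reads, so its MAF is 14/32; snp3 has no reads and is reported as `NA`.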

FILTRATION

| Function name | Description |
|---|---|
| filter_snps_samples | Filter loci and samples from the SNPdata object |
| select_chrom | Select data for a provided list of chromosomes |
| drop_snps | Remove a set of SNPs from the SNPdata object |
| drop_samples | Remove a set of samples from the SNPdata object |

TRANSFORMATION

| Function name | Description |
|---|---|
| phase_mixed_genotypes | Phase mixed genotypes based on the number of reads supporting each allele and a Bernoulli distribution. The process is repeated nsim times, and the data from the iteration with the highest correlation between the MAF of the phased data and the MAF of the raw data are retained |
| impute_missing_genotypes | Impute missing genotypes based on the minor allele frequency and a Bernoulli distribution. The process is repeated nsim times, and the data from the iteration with the highest correlation between the MAF of the imputed data and the MAF of the raw data are retained |
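
The repeat-and-select scheme described above can be sketched in base R as follows. This is a simplified illustration, not the package's implementation: it assumes a genotype matrix coded 0/1 with `NA` for missing calls (rows are SNPs, columns are samples) and draws the allele coded 1 with its observed frequency, which equals the MAF when 1 denotes the minor allele. The names `impute_once` and `impute_best` are hypothetical.

```r
# Fill each missing genotype with one Bernoulli draw, using the observed
# frequency of allele 1 at that SNP as the success probability.
impute_once <- function(g, freq1) {
  missing <- which(is.na(g), arr.ind = TRUE)
  g[missing] <- rbinom(nrow(missing), size = 1, prob = freq1[missing[, "row"]])
  g
}

# Repeat the imputation nsim times and keep the run whose per-SNP MAF
# correlates best with the MAF of the raw (non-missing) data.
impute_best <- function(g, nsim = 10) {
  freq1 <- rowMeans(g, na.rm = TRUE)
  freq1[is.nan(freq1)] <- 0            # SNPs with no observed genotype
  maf_raw <- pmin(freq1, 1 - freq1)
  sims <- lapply(seq_len(nsim), function(i) impute_once(g, freq1))
  cors <- vapply(sims, function(s) {
    f <- rowMeans(s)
    suppressWarnings(cor(pmin(f, 1 - f), maf_raw))
  }, numeric(1))
  best <- which.max(cors)              # NA correlations are ignored
  if (length(best) == 0) best <- 1L    # fall back if every correlation is NA
  list(genotypes = sims[[best]], correlation = cors[best])
}

# Toy data: 20 SNPs x 10 samples with 30 genotypes set to missing.
set.seed(1)
probs <- runif(20, 0.05, 0.5)
g <- matrix(rbinom(20 * 10, 1, probs), nrow = 20)
g[sample(length(g), 30)] <- NA

res <- impute_best(g, nsim = 25)
```

Observed genotypes are never changed; only the `NA` cells are filled. Phasing mixed genotypes follows the same pattern, except that the Bernoulli probability would come from the number of reads supporting each allele rather than from the MAF.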

PARAMETERS

| Function name | Description |
|---|---|
| calculate_wcFst | Calculate Weir & Cockerham's Fst. This is achieved using the vcflib tools |
| calculate_LD | Calculate LD R^2 between all pairs of loci using vcftools |
| calculate_IBS | Calculate the identity-by-state (IBS) matrix |
| calculate_iR | Calculate the iR index between pairs of populations to determine loci with an excess of IBD sharing. The calculation is based on the isoRelate R package |
| calculate_relatedness | Calculate relatedness between every pair of isolates for all pairs of populations. This is based on the model developed by Aimee R. Taylor and co-authors |
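
As an illustration of what an identity-by-state matrix contains, the base R sketch below computes, for every pair of samples, the proportion of loci at which both carry the same allele, skipping loci that are missing in either sample. The 0/1 coding, the matrix layout (rows are SNPs, columns are samples) and the function name `ibs_matrix` are assumptions made for this example; `calculate_IBS` may define or compute IBS differently.

```r
# Hypothetical helper (not part of rSNPdata): pairwise IBS between samples.
ibs_matrix <- function(g) {
  n <- ncol(g)
  ibs <- matrix(NA_real_, n, n, dimnames = list(colnames(g), colnames(g)))
  for (i in seq_len(n)) {
    for (j in i:n) {
      # Only compare loci genotyped in both samples.
      ok <- !is.na(g[, i]) & !is.na(g[, j])
      val <- if (any(ok)) mean(g[ok, i] == g[ok, j]) else NA_real_
      ibs[i, j] <- val
      ibs[j, i] <- val
    }
  }
  ibs
}

# Toy genotypes for three samples at five SNPs (NA = missing call).
g <- cbind(s1 = c(0, 1, 1, 0, NA),
           s2 = c(0, 1, 0, 0, 1),
           s3 = c(1, 0, 0, 1, 1))

m <- ibs_matrix(g)
print(m)
```

Here s1 and s2 agree at 3 of the 4 loci genotyped in both (0.75), s1 and s3 agree at none (0), and s2 and s3 agree at 2 of 5 (0.4).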