Calculations of accuracy comparing Williams lab simulations to RFmix runs

eatkinson/LAIaccuracy

LAIaccuracy

A Python script that calculates the accuracy of RFmix ancestry calls against simulated truth ancestry calls. The program expects two input files:

  1. An ancestral truth file containing the chr, the bp position, and space-delimited phased ancestry calls. It is currently hard-coded to expect 50 haplotypes (25 individuals) and a 2-way admixed scenario (each ancestry call is 0 or 1). An example with two individuals (four haplotypes) at two SNPs:
1 570178 1 1 1 1
1 752566 1 1 0 1

I generated my truth dataset by processing the bp-converted simulated output from the Williams lab simu-mix program.

  2. An RFmix output msp file.
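As an illustration of the truth-file layout described in item 1, here is a minimal sketch of reading one line. It is not taken from the script in this repository: the function name `parse_truth_line` and the `n_haplotypes` parameter are hypothetical, and the validation rules are assumptions based on the description above (50 haplotypes, calls of 0 or 1).

```python
def parse_truth_line(line, n_haplotypes=50):
    """Parse one truth-file line: chr, bp position, then one call per haplotype.

    Hypothetical helper; the default of 50 haplotypes mirrors the value the
    README says is hard-coded in the script.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected chr, bp position, and at least one call")
    chrom = fields[0]
    pos = int(fields[1])
    calls = [int(value) for value in fields[2:]]
    if len(calls) != n_haplotypes:
        raise ValueError(
            f"expected {n_haplotypes} haplotype calls, found {len(calls)}"
        )
    # 2-way admixture: each call must be ancestry 0 or ancestry 1.
    if any(call not in (0, 1) for call in calls):
        raise ValueError("ancestry calls must be 0 or 1")
    return chrom, pos, calls


# The second example line above has two individuals, so four haplotypes.
record = parse_truth_line("1 752566 1 1 0 1", n_haplotypes=4)
```

Here `record` holds the chromosome as a string, the position as an integer, and the four calls as a list of integers.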

Both files can cover a single chromosome or the entire genome. The script first matches on chromosome, then finds the RFmix output window that contains the truth bp position and checks whether the call in that window is correct.
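The matching step can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in `GlobalLAIaccuracy-userflags.py`: it assumes the RFmix windows have already been loaded into memory as `(start_bp, end_bp, calls)` tuples grouped by chromosome and sorted by start, that window ends are inclusive, and that truth positions falling outside every window are skipped. The names `find_window` and `call_accuracy`, and the window coordinates in the example, are hypothetical.

```python
from bisect import bisect_right


def find_window(windows, pos):
    """Return the (start, end, calls) window containing pos, or None.

    `windows` must be sorted by start position and must not overlap.
    """
    starts = [window[0] for window in windows]
    index = bisect_right(starts, pos) - 1
    if index >= 0 and windows[index][0] <= pos <= windows[index][1]:
        return windows[index]
    return None


def call_accuracy(truth_records, windows_by_chrom):
    """Fraction of haplotype calls where the inferred call equals the truth."""
    compared = 0
    correct = 0
    for chrom, pos, truth_calls in truth_records:
        # Match on chromosome first, then on bp position within it.
        window = find_window(windows_by_chrom.get(chrom, []), pos)
        if window is None:
            continue  # assumption: positions outside all windows are ignored
        for truth_call, inferred_call in zip(truth_calls, window[2]):
            compared += 1
            correct += truth_call == inferred_call
    return correct / compared if compared else float("nan")


# Truth records from the example above; window coordinates are made up.
truth = [("1", 570178, [1, 1, 1, 1]), ("1", 752566, [1, 1, 0, 1])]
windows = {"1": [(500000, 600000, [1, 1, 1, 1]), (700000, 800000, [1, 1, 1, 1])]}
print(call_accuracy(truth, windows))  # 7 of 8 calls agree -> 0.875
```

A binary search over window starts keeps each lookup logarithmic in the number of windows, which matters when the truth file covers the whole genome.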

Usage:

python GlobalLAIaccuracy-userflags.py --Anc [TRUTH_FILE] --msp [RFMIX_MSPFILE]

The other files in this repository are hard-coded for a specific project involving a highly multi-way admixed population. They can be manually adapted for other uses.
