README.QPGRAPH

qpGraph -p parfile -g graph [-o outgraph] [-d dotfile] 

See examples/runit for some simple examples and sample output

Sample parameter file
DIR:   ../data
S1:              sim1           
indivname:       DIR/S1.ind  
snpname:         DIR/S1.snp       
genotypename:    DIR/S1.geno  
outpop:         Out     
blgsize:        0.05
## block size in Morgans for Jackknife
## see below 
lsqmode:       YES
diag:          .0001
hires:         YES
initmix:      1000 
precision:    .0001  
zthresh:      3.0
terse:        NO
useallsnps:   NO
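
The sample above uses DIR and S1 inside later values, so they appear to act as substitution variables. A minimal Python sketch of that expansion follows. The helper name expand_parfile and the rule "a key written entirely in uppercase defines a variable" are assumptions made for illustration, not a description of qpGraph's own parser.

```python
def expand_parfile(text):
    """Return (variables, parameters) parsed from parameter-file text.

    Assumption: a key written entirely in uppercase (DIR, S1) defines a
    variable that is substituted into every later value.
    """
    variables, params = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blank lines, comment lines, and lines without a key
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        # Substitute every variable defined so far into this value.
        for name, replacement in variables.items():
            value = value.replace(name, replacement)
        if key.isupper():
            variables[key] = value
        else:
            params[key] = value
    return variables, params


sample = """\
DIR:   ../data
S1:              sim1
indivname:       DIR/S1.ind
outpop:         Out
blgsize:        0.05
## block size in Morgans for Jackknife
"""
variables, params = expand_parfile(sample)
print(params["indivname"])  # prints ../data/sim1.ind
```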


-p parfile specifies input data (any Reich lab format) 

outpop:   zzz
f-stats normalized by heterozygosity in population zzz 
outpop:  NULL  has special meaning : no normalization 
blgsize: block size for Jackknife (Morgans)
lsqmode:   Default is NO, in which case the estimated error covariance is used.  This is unstable for large problems, 
 where instead (lsqmode: YES) we fit f_3 stats with the "base" population being the first label (see graph spec)
diag:   small fudge factor to add on diagonal of f3.  Improves stability
Given admixing weights the edge lengths (drifts) are set by minimizing 
a degree 2 function, subject to non-negativity of the lengths.  
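
As a toy illustration of that constrained fit, the Python sketch below minimizes a sum of squares subject to non-negative lengths using projected gradient descent. The solver, the function name fit_nonneg, and the three-leaf example with made-up f2 values are assumptions chosen for illustration; this README does not say which algorithm qpGraph itself uses.

```python
def fit_nonneg(A, b, iters=2000):
    """Minimize ||A x - b||^2 subject to x >= 0 by projected gradient descent."""
    rows, cols = len(A), len(A[0])
    # The gradient's Lipschitz constant is at most 2 * ||A||_F^2,
    # so this fixed step size is small enough to converge.
    frob2 = sum(a * a for row in A for a in row)
    step = 1.0 / (2.0 * frob2)
    x = [0.0] * cols
    for _ in range(iters):
        resid = [sum(A[i][j] * x[j] for j in range(cols)) - b[i]
                 for i in range(rows)]
        grad = [2.0 * sum(A[i][j] * resid[i] for i in range(rows))
                for j in range(cols)]
        # Gradient step, then project back onto the non-negative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(cols)]
    return x


# Hypothetical star tree with leaves A, B, C and branch lengths a, b, c:
# f2(A,B) = a + b, f2(A,C) = a + c, f2(B,C) = b + c.
design = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
f2 = [0.3, 0.5, 0.4]  # made-up values
lengths = fit_nonneg(design, f2)
print([round(v, 4) for v in lengths])
```

With these made-up inputs the fitted lengths should come out close to 0.2, 0.1 and 0.3. When the unconstrained optimum would be negative, the projection holds that length at 0.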
Initially the program tries V sets of admixing weights. 
The default value for V is 10 * 2^n, where n is the number of admixtures in the graph. 
To override this default:

initmix:   V  
where V is the desired number of initial tries.  This may be useful 
if you are worried the program may not have found the global optimum. 
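
For a sense of scale, the default number of initial tries can be tabulated with a short Python sketch. The function name default_initmix is hypothetical; only the formula 10 * 2^n comes from the text above.

```python
def default_initmix(n_admix):
    """Default number of initial tries: 10 * 2^n for n admixtures."""
    if n_admix < 0:
        raise ValueError("number of admixtures must be non-negative")
    return 10 * 2 ** n_admix


for n in range(4):
    print(n, default_initmix(n))
# prints:
# 0 10
# 1 20
# 2 40
# 3 80
```

By this formula, the value initmix: 1000 in the sample parameter file exceeds the default for any graph with 6 or fewer admixtures.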

hires:   controls output; sometimes more decimals are desired
precision:  controls accuracy of recovered admixture weights.  Coefficients will be within "precision" of the MLE.
zthresh: controls outlier output.  All f-stats are calculated and those diverging more than "zthresh" from fitted values 
 are printed.
useallsnps:  use all SNPs even if some populations have no data for a SNP (default NO) 
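
The zthresh filter can be pictured with the following Python sketch. It assumes that "diverging" is measured as a Z-score, (observed - fitted) / standard error; the function name outliers, the tuple layout, and the numbers are all made up for illustration and are not qpGraph output.

```python
def outliers(stats, zthresh=3.0):
    """Return (label, z) for every statistic whose |Z| exceeds zthresh.

    Each entry of stats is (label, observed, fitted, std_err).
    Assumption: Z = (observed - fitted) / std_err.
    """
    flagged = []
    for label, observed, fitted, std_err in stats:
        z = (observed - fitted) / std_err
        if abs(z) > zthresh:
            flagged.append((label, z))
    return flagged


# Made-up statistics: (label, observed, fitted, standard error)
stats = [
    ("f2(A,B)", 0.10, 0.04, 0.01),   # Z near +6: flagged
    ("f2(A,C)", 0.05, 0.04, 0.01),   # Z near +1: kept quiet
    ("f2(B,C)", 0.00, 0.05, 0.01),   # Z near -5: flagged
]
for label, z in outliers(stats, zthresh=3.0):
    print(label, round(z, 2))
```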

*** new feature *** 
admixin:  <admixwts file>  
 contains lines such as
admix T S1 S2 w1 w2 
 where T is a target node, S1, S2 are source nodes and w1 w2 are weights.  qpGraph  
 will rescale these to sum to 1. 
These lines are in the same format as the output graph [-o option below], so such a graph can be used as the admixwts file 
 (only admix lines are recognized) 
admixout:  <admixwts file> 
This file is of course easy to hand edit. 
If admixin is present then initial try estimation is omitted which will make qpGraph run 
 very much faster.  
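
The admix line format and the rescaling of weights described above can be sketched in Python as follows. The function names and the node names X, B, C are hypothetical, and the input checks are this sketch's own choices rather than documented qpGraph behaviour.

```python
def parse_admix_line(line):
    """Parse 'admix T S1 S2 w1 w2' and rescale the weights to sum to 1."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "admix":
        raise ValueError("expected: admix T S1 S2 w1 w2")
    target, src1, src2 = fields[1:4]
    w1, w2 = float(fields[4]), float(fields[5])
    total = w1 + w2
    if w1 < 0 or w2 < 0 or total <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    return target, src1, src2, w1 / total, w2 / total


def read_admix_lines(text):
    """Keep only lines starting with 'admix'; every other line is ignored."""
    return [parse_admix_line(line)
            for line in text.splitlines()
            if line.split()[:1] == ["admix"]]


example = """\
# hand-edited weights
admix X B C 3 1
"""
print(read_admix_lines(example))  # prints [('X', 'B', 'C', 0.75, 0.25)]
```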
inbreed:  YES 
Genotypes are expected to be pseudo-haploid -- 2 samples at least per population or drift lengths on 
leaves are not meaningful.  


-g graph  
See examples.qpGraph/gr1x 
Note comments can be included in file. Begin a line with #

[-o outgraph]  This can be hand edited and input again.  Note format is different. 
qpGraph will accept either style.  

[-d dotfile]  Graph output in "dot format";  dot -Tps < dotfile > dotfile.ps gives postscript
"dot" must be installed (part of Graphviz) 

*** new feature ***  
Input can now be an fstats file -- see README.qpfstats
*** new feature *** 
halfscore (won't work in combination with fstats)
halfscore:  YES 
halfjackname:  <output file> 
Does a VERY conservative goodness of fit test.  See halfscore.pdf for details.


===================================================================================================
Utility program: qpreroot
If the input graph is not connected or has loops qpGraph will die unpleasantly.  
Best way to see what is happening: 

qpreroot -g graph -d dotfile  
## This also makes postscript

We also can change the root node 
qpreroot -g graph -o outgraph -r node