Basic Pair Trading Study Using Linear Regression #6

Open: wants to merge 23 commits into base: main
26 changes: 26 additions & 0 deletions linear_regression.q
@@ -0,0 +1,26 @@
// Alpha and beta functions
// These formulas are derived from Ordinary Least Squares (OLS) regression for one variable
// OLS regression is a method used to estimate the relationship between a dependent variable (y) and one or more independent variables (x).

// @kind function
// @desc Function to calculate the beta coefficient (slope) using OLS
// The beta coefficient (slope) represents the change in the dependent variable for a one-unit change in the independent variable.
// This function computes the beta coefficient using the formula:
// beta = ((n * Σ(x*y)) - (Σx * Σy)) / ((n * Σ(x^2)) - (Σx)^2)
// @param x {number[]} Independent variable
// @param y {number[]} Dependent variable
// @return {number} Beta (slope)
betaF:{dot:{sum x*y};
  ((n*dot[x;y])-(*/)(sum')(x;y))%
  ((n:count[x])*dot[x;x])-sum[x]xexp 2};


// @kind function
// @desc Function to calculate the alpha coefficient (intercept) using OLS
// The alpha coefficient (intercept) represents the value of the dependent variable when the independent variable is zero.
// This function computes the alpha coefficient using the formula:
// alpha = Mean(y) - beta * Mean(x)
// @param x {number[]} Independent variable
// @param y {number[]} Dependent variable
// @return {number} alpha (intercept)
alphaF: {avg[y]-(betaF[x;y]*avg[x])};
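// Usage sketch, not part of the PR: the inputs below are made-up points lying exactly on y = 1 + 2x,
// so the two functions should return a slope of 2 and an intercept of 1.
//   q)betaF[1 2 3 4f;3 5 7 9f]
//   2f
//   q)alphaF[1 2 3 4f;3 5 7 9f]
//   1f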
68 changes: 68 additions & 0 deletions streamPair.q
@@ -0,0 +1,68 @@
// import linear regression module
\l linear_regression.q

// load tables
tab1: 1_ flip `dateTime`bid`ask`bidVol`askVol!("*FFFF";",") 0: `:data/USA500IDXUSD.csv;
tab2: 1_ flip `dateTime`bid`ask`bidVol`askVol!("*FFFF";",") 0: `:data/USATECHIDXUSD.csv;
tab3: flip `dateTime`spread`mean`up`low`ewma`up2`low2!("P"$();"F"$();"F"$();"F"$();"F"$();"F"$();"F"$();"F"$());
historial_tab1: 1_ flip `open`high`low`close`adjClose`vol!("FFFFFF";",") 0: `:data/SP500_hist.csv;
historial_tab2: 1_ flip `open`high`low`close`adjClose`vol!("FFFFFF";",") 0: `:data/NASDAQ100_hist.csv;

// Fix data and take log(prices)
priceX: 0!1_(update delta:0f^deltas dateTime from distinct select distinct dateTime, log bid, log ask from update dateTime:"P"$@[;19;:;"."] each dateTime from tab1);
priceY: 0!1_(update delta:0f^deltas dateTime from distinct select distinct dateTime, log bid, log ask from update dateTime:"P"$@[;19;:;"."] each dateTime from tab2);

// Calculate alpha and beta from historical values
beta_lr: betaF[px:-100#log historial_tab1`close;py:-100#log historial_tab2`close]; // we only take most recent 100 values for the alpha and beta
alpha_lr: alphaF[px;py];
// Historical standard deviation of the raw log-bid difference (priceY - priceX) over the first 1000 ticks
std_lr: dev[(1000#exec bid from priceY) - (1000#exec bid from priceX)];

/ load and initialize kdb+tick
/ all tables in the top level namespace (.) become publish-able
\l tick/u.q
.u.init[];

// Read and write on buffer functions
.ringBuffer.read:{[t;i] $[i<=count t; i#t; i rotate t] }
.ringBuffer.write:{[t;r;i] @[t;(i mod count value t)+til 1;:;r];}
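// Usage sketch for the writer, assuming a hypothetical global list `buf that is not part of the PR:
// the write index wraps modulo the buffer length, so index 6 lands in slot 1 of a 5-slot buffer.
//   q)buf:5#0N
//   q).ringBuffer.write[`buf;enlist 7;6]
//   q)buf
//   0N 7 0N 0N 0N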

// Initialize the indices and empty buffer tables (dashboards read these objects directly)
.streamPair.i:-1;
.streamPair.iEWMA:-1;
.streamPair.priceX: 1000#tAux: 1_1#priceX;
.streamPair.priceY: 1000#tAux;
.streamPair.spreads: 1000#tab3;

// Timer function
timer:{t:.z.p;while[.z.p<t+x&abs x-16*1e6]} / 16 <- timer variable

.streamPair.genPair:{
  // Wait for the time gap (delta) recorded for the next tick
  d: `float$(priceX[.streamPair.i+:1][`delta]);
  timer[d];
  // Take the i-th row from each price table
  resX: enlist priceX[.streamPair.i];
  resY: enlist priceY[.streamPair.i];

  // Spread: residual of priceY's log bid against the regression line fitted on priceX
  s: priceY[.streamPair.i][`bid] - ((priceX[.streamPair.i][`bid] * beta_lr)+alpha_lr);
  ewma: dev[ema[0.04; .streamPair.iEWMA#0f^(exec spread from .streamPair.spreads)]];
  $[.streamPair.iEWMA>999;.streamPair.iEWMA:998;.streamPair.iEWMA+:1];
  resSpread: enlist `dateTime`spread`mean`up`low`ewma`up2`low2!("p"$(priceX[.streamPair.i][`dateTime]);"f"$(s);"f"$(0);"f"$(1.96*std_lr);"f"$(-1.96*std_lr);"f"$0f^(ewma); "f"$(0f^(1.96*(last 1000 mdev (exec spread from .streamPair.spreads)))); "f"$0f^(-1.96*(last 1000 mdev (exec spread from .streamPair.spreads))));

  // Write the new rows into the ring-buffer tables
  .ringBuffer.write[`.streamPair.priceX;resX;.streamPair.i];
  .ringBuffer.write[`.streamPair.priceY;resY;.streamPair.i];
  .ringBuffer.write[`.streamPair.spreads;resSpread;.streamPair.i];
  resX

}

// Generate one stream update per timer tick (the interval is set by \t below)
.z.ts: {.streamPair.genPair[]}

// Snapshot read from our buffer
.u.snap:{[t] .ringBuffer.read[.streamPair.priceX;.streamPair.i]} // reqd. by dashboards

\t 16
18 changes: 18 additions & 0 deletions tick/u.q
@@ -0,0 +1,18 @@
/2019.06.17 ensure sym has g attr for schema returned to new subscriber
/2008.09.09 .k -> .q
/2006.05.08 add

\d .u
init:{w::t!(count t::tables`.)#()}

del:{w[x]_:w[x;;0]?y};.z.pc:{del[;x]each t};

sel:{$[`~y;x;select from x where sym in y]}

pub:{[t;x]{[t;x;w]if[count x:sel[x]w 1;(neg first w)(`upd;t;x)]}[t;x]each w t}

add:{$[(count w x)>i:w[x;;0]?.z.w;.[`.u.w;(x;i;1);union;y];w[x],:enlist(.z.w;y)];(x;$[99=type v:value x;sel[v]y;@[0#v;`sym;`g#]])}

sub:{if[x~`;:sub[;y]each t];if[not x in t;'x];del[x].z.w;add[x;y]}

end:{(neg union/[w[;;0]])@\:(`.u.end;x)}