diff --git a/notebooks/DA_demo_L96.ipynb b/notebooks/DA_demo_L96.ipynb
index 37d6e61a..d35ba368 100644
--- a/notebooks/DA_demo_L96.ipynb
+++ b/notebooks/DA_demo_L96.ipynb
@@ -10,38 +10,46 @@
  },
  {
   "cell_type": "markdown",
-  "id": "4c383a2a",
+  "id": "c33e3435",
   "metadata": {},
   "source": [
-   "# Recap of L96 and notation"
-  ]
- },
- {
-  "cell_type": "markdown",
-  "id": "2b0a10bb",
-  "metadata": {
-   "user_expressions": []
-  },
-  "source": [
-   "{cite}`Lorenz1995` describes a two-time scale dynamical system using two equations which are:\n",
+   "# What is DA? Why do we do it?\n",
+   "\n",
+   "Data assimilation (DA) is an approach for combining data (e.g. observations) with prior knowledge (e.g. from a physical model) to obtain an estimate of the true state of a system. We use data assimilation to generate accurate initial conditions for weather and climate predictions, and to produce reanalysis products (state estimates) of the atmosphere-ocean-sea ice-land system.\n",
+   "\n",
+   "Conceptually, data assimilation follows Bayes' rule:\n",
    "\n",
    "\begin{align}\n",
-   "\frac{d}{dt} X_k\n",
-   "&= - X_{k-1} \left( X_{k-2} - X_{k+1} \right) - X_k + F - \left( \frac{hc}{b} \right) \sum_{j=0}^{J-1} Y_{j,k}\n",
-   "\\\n",
-   "\frac{d}{dt} Y_{j,k}\n",
-   "&= - cbY_{j+1,k} \left( Y_{j+2,k} - X_{j-1,k} \right) - c Y_{j,k} + \frac{hc}{b} X_k\n",
+   "P(X|Obs) = \frac{P(Obs|X) P(X)}{P(Obs)}\n",
    "\end{align}\n",
    "\n",
-   "where $X_k$, $k=0,\ldots,K-1$, denotes the _large scale_ with $K$ degrees of freedom. The $k$ index is periodic, meaning $k=K$ is referring to $k=0$, $k=-1$ is referring to $k=K-1$, and so on. The $j$ indices represent a sub-division of each $k$-element denoting that the $J$ $Y$-values are coupled to a single $X$ value. When $(j+1,k)$ refers to a value beyond $J$, it cycles and refers back to the first value $(1,k+1)$. The slow time-scale queation is forced by the parameter $F$, which determines the chaotic behaviour of the system {cite}`Wilks2005`. The overall structure is illustrated in Fig. 1.\n",
+   "where $X$ is the state of the dynamical system, $P(X|Obs)$ is the posterior distribution, or the analysis in DA terms, and $P(X)$ is the prior distribution, or the forecast in DA terms. $P(Obs|X)$ is the likelihood of the observations given the state. Simply put, we want the best estimate of the current state $X$ of the system given the available observations.\n",
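+   "\n",
+   "As a minimal illustration of this update (a hypothetical sketch for intuition only; these variable names are not used elsewhere in the notebook), consider a scalar state with a Gaussian prior and a single noisy observation:\n",
+   "\n",
+   "```python\n",
+   "x_prior, sig_b = 2.0, 1.0  # prior (forecast) mean and its error standard deviation\n",
+   "y_obs, sig_o = 2.8, 0.5    # observed value and its error standard deviation\n",
+   "\n",
+   "gain = sig_b**2 / (sig_b**2 + sig_o**2)      # weight given to the observation\n",
+   "x_post = x_prior + gain * (y_obs - x_prior)  # posterior (analysis) mean: 2.64\n",
+   "var_post = (1 - gain) * sig_b**2             # posterior (analysis) variance: 0.2\n",
+   "```\n",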
\n", "\n", - "*Visualisation of a two-scale Lorenz '96 system with J = 8 and K = 6. Global-scale variables ($X_k$) are updated based on neighbouring variables and on the local-scale variables ($Y_{j,k}$) associated with the corresponding global-scale variable. Local-scale variabless are updated based on neighbouring variables and the associated global-scale variable. The neighbourhood topology of both local and global-scale variables is circular. Image from [Exploiting the chaotic behaviour of atmospheric models with reconfigurable architectures - Scientific Figure on ResearchGate.](https://www.researchgate.net/figure/Visualisation-of-a-two-scale-Lorenz-96-system-with-J-8-and-K-6-Global-scale-values_fig1_319201436)*.\n", - "```" + "We will follow the follwing steps when doing DA in this idealized biased L96 framework:\n", + "1. Specify a biased “GCM” model for performing DA, and set the parameters for the \"GCM\", observations, and DA method in `Config`.\n", + " * Parameters to experiment in `Config`: forcing `F`, seasonal cycle, experiment length, \n", + "2. Generate a “truth” signal from the 2-scale L96 model\n", + "3. Collect sparse and noisy observations, sampled from “truth”\n", + " * Parameters to experiment in `Config`: observation density in space, frequency in time, observational error\n", + "4. Perform DA using a method of choice\n", + " * Parameters to experiment in `Config`: DA method (EnKF, 3DVAR, Hybrid EnKF), DA frequency, ensemble size (for EnKF), localization, inflation method and factor\n", + " * Optional: parameter estimation to recover the true forcing\n", + "5. Assess DA performance\n", + " * RMSE of DA experiment relative to truth\n", + "6. Analyze DA increments to assess GCM model error (optional)\n" ] }, { @@ -49,7 +57,7 @@ "id": "66584b6f", "metadata": {}, "source": [ - "# 1. Define variables and functions to use throughout notebook" + "# 1. 
Define our \"GCM\" and DA parameters to use throughout notebook" ] }, { @@ -67,79 +75,11 @@ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from L96_model import L96, RK4, L96_2t_xdot_ydot, L96_eq1_xdot, L96s\n", + "import DA_methods\n", "\n", "rng = np.random.default_rng()" ] }, - { - "cell_type": "code", - "execution_count": null, - "id": "e2c3e7af", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "class Config:\n", - " # fmt: off\n", - " K = 40 # Dimension of L96 `X` variables\n", - " J = 10 # Dimension of L96 `Y` variables\n", - " obs_freq = 10 # Observation frequency (number of sampling intervals (si) per observation)\n", - " F_truth = 10 # F for truth signal\n", - " F_fcst = 10 # F for forecast (DA) model\n", - " GCM_param = np.array([0, 0, 0, 0]) # Polynomial coefficicents for GCM parameterization\n", - " ns_da = 2000 # Number of time samples for DA\n", - " ns = 2000 # Number of time samples for truth signal\n", - " ns_spinup = 200 # Number of time samples for spin up\n", - " dt = 0.005 # Model timestep\n", - " si = 0.005 # Truth sampling interval\n", - " seasonal = False # Option for adding a seasonal cycle to the forcing in the L96 truth model\n", - " B_loc = 5 # Spatial localization radius for DA\n", - " DA = \"EnKF\" # DA method\n", - " nens = 100 # Number of ensemble members for DA\n", - " inflate_opt = \"relaxation\" # Method for DA model covariance inflation\n", - " inflate_factor = 0.2 # Inflation factor\n", - " hybrid_factor = 0.1 # Inflation factor for hybrid EnKF method\n", - " param_DA = False # Switch to parameter estimation with DA\n", - " param_sd = [0.01, 0.02, 0.1, 0.5] # Polynomal parameter standard deviation\n", - " param_inflate = \"multiplicative\" # Method for parameter variance inflation\n", - " param_inf_factor = 0.02 # Parameter inflation factor\n", - " obs_density = 0.2 # Fraction of spatial gridpoints where observations are collected\n", - " DA_freq = 10 # Assimilation frequency (number of sampling intervals (si) per assimilation step)\n", - " obs_sigma = 0.5 # Observational error standard deviation\n", - " save_fig = False # Switch to save figure file\n", - " save_data = False # Switch to save\n", - " # fmt: on" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "3824759d-3678-4915-9793-55595d0989e8", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def observation_operator(K, l_obs, t_obs, i_t):\n", - " \"\"\"Observation operator to map between model and observation space,\n", - " assuming linearity and model space observations.\n", - "\n", - " Args:\n", - " K: spatial dimension of the model\n", - " l_obs: spatial positions of observations on model grid\n", - " t_obs: time positions of observations\n", - " i_t: the timestep of the current DA cycle\n", - "\n", - " Returns:\n", - " Operator matrix (K * observation_density, K)\n", - " \"\"\"\n", - " n = l_obs.shape[-1]\n", - " H = np.zeros((n, K))\n", - " H[range(n), l_obs[t_obs == i_t]] = 1\n", - " return H" - ] - }, { "cell_type": "code", "execution_count": null, @@ -181,151 +121,40 @@ { "cell_type": "code", "execution_count": null, - "id": "4e65b0ea-ba54-406e-878c-39b46e776022", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def get_dist(i, j, K):\n", - " \"\"\"Compute the absolute distance between two element indices\n", - " within a square matrix of size (K x K)\n", - "\n", - " Args:\n", - " i: the ith row index\n", - " j: the jth column index\n", - " K: shape of square array\n", - "\n", - " Returns:\n", - " 
Distance\n", - " \"\"\"\n", - " return abs(i - j) if abs(i - j) <= K / 2 else K - abs(i - j)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5ab2c12b-3b13-4557-9228-60222de7470c", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def gaspari_cohn(distance, radius):\n", - " \"\"\"Compute the appropriate distance dependent weighting of a\n", - " covariance matrix, after Gaspari & Cohn, 1999 (https://doi.org/10.1002/qj.49712555417)\n", - "\n", - " Args:\n", - " distance: the distance between array elements\n", - " radius: localization radius for DA\n", - "\n", - " Returns:\n", - " distance dependent weight of the (i,j) index of a covariance matrix\n", - " \"\"\"\n", - " if distance == 0:\n", - " weight = 1.0\n", - " else:\n", - " if radius == 0:\n", - " weight = 0.0\n", - " else:\n", - " ratio = distance / radius\n", - " weight = 0.0\n", - " if ratio <= 1:\n", - " weight = (\n", - " -(ratio**5) / 4\n", - " + ratio**4 / 2\n", - " + 5 * ratio**3 / 8\n", - " - 5 * ratio**2 / 3\n", - " + 1\n", - " )\n", - " elif ratio <= 2:\n", - " weight = (\n", - " ratio**5 / 12\n", - " - ratio**4 / 2\n", - " + 5 * ratio**3 / 8\n", - " + 5 * ratio**2 / 3\n", - " - 5 * ratio\n", - " + 4\n", - " - 2 / 3 / ratio\n", - " )\n", - " return weight" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "86f93dee-eea4-44e0-9ad6-113ab0912688", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def localize_covariance(B, loc=0):\n", - " \"\"\"Localize the model climatology covariance matrix, based on\n", - " the Gaspari-Cohn function.\n", - "\n", - " Args:\n", - " B: Covariance matrix over a long model run 'M_truth' (K, K)\n", - " loc: spatial localization radius for DA\n", - "\n", - " Returns:\n", - " Covariance matrix scaled to zero outside distance 'loc' from diagonal and\n", - " the matrix of weights which are used to scale covariance matrix\n", - " \"\"\"\n", - " M, N = B.shape\n", - " X, Y = np.ix_(np.arange(M), np.arange(N))\n", - " dist = np.vectorize(get_dist)(X, Y, M)\n", - " W = np.vectorize(gaspari_cohn)(dist, loc)\n", - " return B * W, W" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "fc3b629d-398b-4b96-9d63-e95eaab8f688", - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "def running_average(X, N):\n", - " \"\"\"Compute running mean over a user-specified window.\n", - "\n", - " Args:\n", - " X: Input vector of arbitrary length 'n'\n", - " N: Size of window over which to compute mean\n", - "\n", - " Returns:\n", - " X averaged over window N\n", - " \"\"\"\n", - " if N % 2 == 0:\n", - " N1, N2 = -N / 2, N / 2\n", - " else:\n", - " N1, N2 = -(N - 1) / 2, (N + 1) / 2\n", - " X_sum = np.zeros(X.shape)\n", - " for i in np.arange(N1, N2):\n", - " X_sum = X_sum + np.roll(X, int(i), axis=0)\n", - " return X_sum / N" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "e6a648d4-d5af-419c-bdaa-9dd81b14cffe", - "metadata": { - "tags": [] - }, + "id": "3d2efdbe", + "metadata": {}, "outputs": [], "source": [ - "def find_obs(loc, obs, t_obs, l_obs, period):\n", - " \"\"\"NOTE: This function is for plotting purposes only.\"\"\"\n", - " t_period = np.where((t_obs[:, 0] >= period[0]) & (t_obs[:, 0] < period[1]))\n", - " obs_period = np.zeros(t_period[0].shape)\n", - " obs_period[:] = np.nan\n", - " for i in np.arange(len(obs_period)):\n", - " if np.any(l_obs[t_period[0][i]] == loc):\n", - " obs_period[i] = obs[t_period[0][i]][l_obs[t_period[0][i]] == loc]\n", - " return obs_period" + "class 
Config:\n", + " # fmt: off\n", + " K = 40 # Dimension of L96 `X` variables\n", + " J = 10 # Dimension of L96 `Y` variables\n", + " obs_freq = 10 # Observation frequency (number of sampling intervals (si) per observation)\n", + " F_truth = 10 # F for truth signal\n", + " F_fcst = 10 # F for forecast (DA) model\n", + " GCM_param = np.array([0, 0, 0, 0]) # Polynomial coefficicents for GCM parameterization\n", + " ns_da = 2000 # Number of time samples for DA\n", + " ns = 2000 # Number of time samples for truth signal\n", + " ns_spinup = 200 # Number of time samples for spin up\n", + " dt = 0.005 # Model timestep\n", + " si = 0.005 # Truth sampling interval\n", + " seasonal = False # Option for adding a seasonal cycle to the forcing in the L96 truth model\n", + " B_loc = 5 # Spatial localization radius for DA in terms of model grid points\n", + " DA = \"EnKF\" # DA method\n", + " nens = 500 # Number of ensemble members for DA\n", + " inflate_opt = \"relaxation\" # Method for DA model covariance inflation\n", + " inflate_factor = 0.65 # Inflation factor\n", + " hybrid_factor = 0.1 # Weighting factor for hybrid EnKF method (weight of static B matrix)\n", + " param_DA = False # Switch to turn on parameter estimation with DA\n", + " param_sd = [0.01, 0.02, 0.1, 0.5] # Polynomal parameter standard deviation\n", + " param_inflate = \"multiplicative\" # Method for parameter variance inflation\n", + " param_inf_factor = 0.02 # Parameter inflation factor\n", + " obs_density = 0.2 # Fraction of spatial gridpoints where observations are collected\n", + " DA_freq = 10 # Assimilation frequency (number of sampling intervals (si) per assimilation step)\n", + " obs_sigma = 0.5 # Observational error standard deviation\n", + " save_fig = False # Switch to save figure file\n", + " save_data = False # Switch to save\n", + " # fmt: on" ] }, { @@ -449,7 +278,7 @@ "user_expressions": [] }, "source": [ - "Now, we generate climatological background (temporal) covariance for the 2-scale L96 model" + "Now, we generate climatological background (temporal) covariance `B_clim2` for the 2-scale L96 model." ] }, { @@ -526,7 +355,7 @@ "user_expressions": [] }, "source": [ - "Generating climatological background covariance for 1-scale L96 model" + "Generating climatological background covariance `B_clim1` for 1-scale L96 model. This climatological covariance will be used by the 3-dimensional variational (3DVar) DA method, or by the hybrid 3DVar-EnKF method. Since we cannot observe every state variable at every possible location, when assimilting sparse observations into the model, we need to use each observation to update multiple model locations, usually around the observed location. The covariance are thus needed to project the model-observation misfits between different model locations in certain DA methods such as 3DVar and EnKF." ] }, { @@ -553,7 +382,7 @@ "source": [ "# 3. Generate synthetic observations\n", "\n", - "Here we generate a set of sparse observations by sampling from the `X_truth` run and adding some random Gaussian noise." + "Here we generate a set of sparse observations by sampling from the `X_truth` run and adding some random Gaussian noise. The observations are stored as individual samples, each containing the time, location, and value." 
 ]
},
{
@@ -587,7 +416,7 @@
  "user_expressions": []
 },
 "source": [
-  "The covariance matrix over the observations `R` (used to express the uncertainty in the observations during DA) is given as a diagonal matrix with entries defined by the square of the \"observational error standard deviation\"."
+  "The covariance matrix `R` for the observations (used to express the uncertainty in the observations during DA) is given as a diagonal matrix with entries defined by the square of the \"observational error standard deviation\". `R` is diagonal because we assume that the errors of different observations are independent of each other. The function `find_obs` is used here to find the locations and values of a set of observations given the observed times."
 ]
},
{
@@ -616,7 +445,7 @@
  "plt.plot(range(1000, 1500), X_truth[1000:1500, 0], label=\"truth\")\n",
  "plt.scatter(\n",
  "    t_obs[100:150, 0],\n",
-  "    find_obs(0, X_obs, t_obs, l_obs, [t_obs[100, 0], t_obs[150, 0]]),\n",
+  "    DA_methods.find_obs(0, X_obs, t_obs, l_obs, [t_obs[100, 0], t_obs[150, 0]]),\n",
  "    color=\"k\",\n",
  "    label=\"obs\",\n",
  ")\n",
@@ -630,36 +459,9 @@
  "user_expressions": []
 },
 "source": [
-  "# 4. Apply Localization to the Background Model Covariance\n",
+  "# 4. Check the Model Background Covariances for the variable `X`\n",
  "\n",
-  "The covariance of the model climatology was computed a-priori from a long run. In this step we apply the localized weighting to the covariance matrix."
- ]
-},
-{
- "cell_type": "code",
- "execution_count": null,
- "id": "5f5501b8-82d7-428a-a030-47d8c50fc4bb",
- "metadata": {
-  "tags": []
- },
- "outputs": [],
- "source": [
-  "import importlib\n",
-  "\n",
-  "import DA_methods\n",
-  "\n",
-  "importlib.reload(DA_methods);"
- ]
-},
-{
- "cell_type": "markdown",
- "id": "eb095211-2a3f-47c1-899c-ecd4bc4a802d",
- "metadata": {
-  "tags": [],
-  "user_expressions": []
- },
- "source": [
-  "Load pre-calculated climatological background covariance matrix from a long simulation"
+  "The climatological covariances of both the \"truth\" (2-scale L96) and \"GCM\" (1-scale L96) are computed a priori from a long run.\n",
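+  "\n",
+  "As an illustrative sketch (assuming a long trajectory `X_run` of shape `(n_samples, K)`; the matrices used here were pre-computed in the cells above), such a climatological covariance can be estimated as:\n",
+  "\n",
+  "```python\n",
+  "B_clim = np.cov(X_run.T)  # (K, K) covariance of X over the long run\n",
+  "```\n"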
" ] }, { @@ -671,8 +473,6 @@ }, "outputs": [], "source": [ - "B_loc, W_clim = localize_covariance(B_clim1, loc=Config.B_loc)\n", - "\n", "B_corr1 = np.zeros(B_clim1.shape)\n", "B_corr2 = np.zeros(B_clim2.shape)\n", "for i in range(B_clim1.shape[0]):\n", @@ -764,6 +564,11 @@ }, "outputs": [], "source": [ + "# Localize the covariance matrix in advance to avoid the calculation at each DA step\n", + "# The localization weight matrix is also saved for EnKF\n", + "\n", + "B_loc, W_clim = DA_methods.localize_covariance(B_clim1, loc=Config.B_loc)\n", + "\n", "if Config.param_DA:\n", " mean_param = np.zeros((int(Config.ns_da / Config.DA_freq), len(Config.GCM_param)))\n", " spread_param = np.zeros((int(Config.ns_da / Config.DA_freq), len(Config.GCM_param)))\n", @@ -822,7 +627,7 @@ " # Call DA\n", " t_DA[cycle] = t_truth[i_t]\n", " if Config.DA == \"EnKF\":\n", - " H = observation_operator(Config.K, l_obs, t_obs, i_t)\n", + " H = DA_methods.observation_operator(Config.K, l_obs, t_obs, i_t)\n", "\n", " # Augment state vector with parameters when doing parameter estimation\n", " if Config.param_DA:\n", @@ -851,7 +656,7 @@ " ens_param = X_post[-len(Config.GCM_param) :, :]\n", "\n", " elif Config.DA == \"HyEnKF\":\n", - " H = observation_operator(Config.K, l_obs, t_obs, i_t)\n", + " H = DA_methods.observation_operator(Config.K, l_obs, t_obs, i_t)\n", " B_ens = (\n", " np.cov(X_prior) * (1 - Config.hybrid_factor)\n", " + B_clim1 * Config.hybrid_factor\n", @@ -864,7 +669,7 @@ "\n", " elif Config.DA == \"3DVar\":\n", " X_prior = np.squeeze(X_prior)\n", - " H = observation_operator(Config.K, l_obs, t_obs, i_t)\n", + " H = DA_methods.observation_operator(Config.K, l_obs, t_obs, i_t)\n", " X_post = DA_methods.Lin3dvar(X_prior, X_obs[t_obs == i_t], H, R, B_loc, 3)\n", "\n", " elif Config.DA == \"Replace\":\n", @@ -968,7 +773,9 @@ "X_inc_ave = X_inc / Config.DA_freq / Config.si\n", "axes[0, 2].plot(M_truth.k, X_inc_ave.mean(axis=(0, -1)), label=\"Inc\")\n", "axes[0, 2].plot(\n", - " M_truth.k, running_average(X_inc_ave.mean(axis=(0, -1)), 7), label=\"Inc Ave\"\n", + " M_truth.k,\n", + " DA_methods.running_average(X_inc_ave.mean(axis=(0, -1)), 7),\n", + " label=\"Inc Ave\",\n", ")\n", "axes[0, 2].plot(\n", " M_truth.k,\n", @@ -1000,7 +807,7 @@ ")\n", "axes[1, 0].scatter(\n", " t_DA[plot_start_DA - 1 : plot_end_DA - 1],\n", - " find_obs(plot_x, X_obs, t_obs, l_obs, [plot_start, plot_end]),\n", + " DA_methods.find_obs(plot_x, X_obs, t_obs, l_obs, [plot_start, plot_end]),\n", " label=\"obs\",\n", ")\n", "axes[1, 0].grid(which=\"both\", linestyle=\"--\")\n", @@ -1012,7 +819,7 @@ " for i, c in zip(np.arange(len(Config.GCM_param), 0, -1), [\"r\", \"b\", \"g\", \"k\"]):\n", " axes[1, 1].plot(\n", " t_DA,\n", - " running_average(mean_param[:, i - 1], 100),\n", + " DA_methods.running_average(mean_param[:, i - 1], 100),\n", " c + \"-\",\n", " label=\"C{} {:3f}\".format(\n", " i - 1, mean_param[int(len(t_DA) / 2) :, i - 1].mean()\n", @@ -1020,13 +827,17 @@ " )\n", " axes[1, 1].plot(\n", " t_DA,\n", - " running_average(mean_param[:, i - 1] + spread_param[:, i - 1], 100),\n", + " DA_methods.running_average(\n", + " mean_param[:, i - 1] + spread_param[:, i - 1], 100\n", + " ),\n", " c + \":\",\n", " label=f\"SD {spread_param[int(len(t_DA) / 2):, i - 1].mean():3f}\",\n", " )\n", " axes[1, 1].plot(\n", " t_DA,\n", - " running_average(mean_param[:, i - 1] - spread_param[:, i - 1], 100),\n", + " DA_methods.running_average(\n", + " mean_param[:, i - 1] - spread_param[:, i - 1], 100\n", + " ),\n", " c + \":\",\n", " )\n", " axes[1, 
1].legend()\n", @@ -1114,14 +925,6 @@ " X_inc_ave=X_inc_ave,\n", " )" ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1e96ab6d-95db-4882-ab8c-687b0f9c68bd", - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { @@ -1140,7 +943,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.16" + "version": "3.8.12" } }, "nbformat": 4, diff --git a/notebooks/DA_methods.py b/notebooks/DA_methods.py index e35faba5..6f49fe10 100644 --- a/notebooks/DA_methods.py +++ b/notebooks/DA_methods.py @@ -1,11 +1,135 @@ """ Data assimilation methods -Adapted form PyDA project: https://github.com/Shady-Ahmed/PyDA +Partly adapted form PyDA project: https://github.com/Shady-Ahmed/PyDA Reference: https://www.mdpi.com/2311-5521/5/4/225 """ import numpy as np from numba import njit +def observation_operator(K, l_obs, t_obs, i_t): + """Observation operator to map between model and observation space, + assuming linearity and model space observations. + + Args: + K: spatial dimension of the model + l_obs: spatial positions of observations on model grid + t_obs: time positions of observations + i_t: the timestep of the current DA cycle + + Returns: + Operator matrix (K * observation_density, K) + """ + n = l_obs.shape[-1] + H = np.zeros((n, K)) + H[range(n), l_obs[t_obs == i_t]] = 1 + return H + + +def get_dist(i, j, K): + """Compute the absolute distance between two element indices + within a square matrix of size (K x K) + + Args: + i: the ith row index + j: the jth column index + K: shape of square array + + Returns: + Distance + """ + return abs(i - j) if abs(i - j) <= K / 2 else K - abs(i - j) + + +def gaspari_cohn(distance, radius): + """Compute the appropriate distance dependent weighting of a + covariance matrix, after Gaspari & Cohn, 1999 (https://doi.org/10.1002/qj.49712555417) + + Args: + distance: the distance between array elements + radius: localization radius for DA + + Returns: + distance dependent weight of the (i,j) index of a covariance matrix + """ + if distance == 0: + weight = 1.0 + else: + if radius == 0: + weight = 0.0 + else: + ratio = distance / radius + weight = 0.0 + if ratio <= 1: + weight = ( + -(ratio**5) / 4 + + ratio**4 / 2 + + 5 * ratio**3 / 8 + - 5 * ratio**2 / 3 + + 1 + ) + elif ratio <= 2: + weight = ( + ratio**5 / 12 + - ratio**4 / 2 + + 5 * ratio**3 / 8 + + 5 * ratio**2 / 3 + - 5 * ratio + + 4 + - 2 / 3 / ratio + ) + return weight + + +def localize_covariance(B, loc=0): + """Localize the model climatology covariance matrix, based on + the Gaspari-Cohn function. + + Args: + B: Covariance matrix over a long model run 'M_truth' (K, K) + loc: spatial localization radius for DA + + Returns: + Covariance matrix scaled to zero outside distance 'loc' from diagonal and + the matrix of weights which are used to scale covariance matrix + """ + M, N = B.shape + X, Y = np.ix_(np.arange(M), np.arange(N)) + dist = np.vectorize(get_dist)(X, Y, M) + W = np.vectorize(gaspari_cohn)(dist, loc) + return B * W, W + + +def running_average(X, N): + """Compute running mean over a user-specified window. 
+
+    Args:
+        X: Input vector of arbitrary length 'n'
+        N: Size of window over which to compute mean
+
+    Returns:
+        X averaged over window N
+    """
+    if N % 2 == 0:
+        N1, N2 = -N / 2, N / 2
+    else:
+        N1, N2 = -(N - 1) / 2, (N + 1) / 2
+    X_sum = np.zeros(X.shape)
+    for i in np.arange(N1, N2):
+        X_sum = X_sum + np.roll(X, int(i), axis=0)
+    return X_sum / N
+
+
+def find_obs(loc, obs, t_obs, l_obs, period):
+    """Find the observed values at a given model location within a time period.
+
+    NOTE: This function is for plotting purposes only.
+
+    Args:
+        loc: spatial position of interest on the model grid
+        obs: array of observed values
+        t_obs: time positions of observations
+        l_obs: spatial positions of observations on model grid
+        period: (start, end) times delimiting the plotting window
+
+    Returns:
+        Array of observed values at `loc` (NaN at times when `loc` was not observed)
+    """
+    t_period = np.where((t_obs[:, 0] >= period[0]) & (t_obs[:, 0] < period[1]))
+    obs_period = np.zeros(t_period[0].shape)
+    obs_period[:] = np.nan
+    for i in np.arange(len(obs_period)):
+        if np.any(l_obs[t_period[0][i]] == loc):
+            obs_period[i] = obs[t_period[0][i]][l_obs[t_period[0][i]] == loc]
+    return obs_period
+
+
 @njit
 def Lin3dvar(ub, w, H, R, B, opt):
     # The solution of the 3DVAR problem in the linear case requires
diff --git a/notebooks/figs/DA_step.png b/notebooks/figs/DA_step.png
new file mode 100644
index 00000000..937fecb9
Binary files /dev/null and b/notebooks/figs/DA_step.png differ