Estimating under-ascertainment of symptomatic COVID-19 cases over time

This repository contains code to run the inference framework that estimates time-varying ascertainment rates of COVID-19 cases (Russell et al.). To do so, we use a Gaussian Process modelling framework, fit to the confirmed COVID-19 death time series for the country or region in question (see Russell et al. for more details on the methods and limitations involved).

To run the code, first clone this repository with the command

git clone https://github.com/thimotei/CFR_calculation

The time-varying estimates result from fitting a Gaussian Process model, implemented with the R libraries greta and greta.gp. These must be run from a virtual environment, which the script that runs the model takes care of. First, the user needs to run the following command to ensure the necessary packages are installed:

install.packages(c("reticulate", "greta", "greta.gp"))
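If some of these packages may already be installed, it can help to check first. The following is a minimal sketch, not part of this repository: `missing_packages` is a hypothetical helper that uses only base R to report which of the required packages are absent.

```r
# Hypothetical helper (not part of this repository): return the entries of
# `required` that do not appear in `installed`.
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

# Packages named in this README.
to_install <- missing_packages(c("reticulate", "greta", "greta.gp"))

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install only what is missing
} else {
  message("All required packages are already installed.")
}
```

The install call is left commented out so that running the check has no side effects.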

reticulate provides the interface from R to a Python virtual environment. greta needs this environment because it calls tensorflow from within it.

The user therefore needs to install the correct version of tensorflow for greta. This is done from R with the following commands (the same commands are in the main script, but commented out and need only to be run once):

library(reticulate)
use_condaenv('r-reticulate', required = TRUE)
library(greta)
library(greta.gp)
greta::install_tensorflow(method = "conda",
                          version = "1.14.0",
                          extra_packages = "tensorflow-probability==0.7")

Once the user has installed tensorflow, they can run the model from within the script

scripts/main_script_GP.R

which runs the model for a single country or region, specified by its three-letter ISO code. The script downloads the latest data from the Johns Hopkins COVID-19 dataset and reformats it into the structure the model expects using the function in

R/jhu_data_import.R
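Because the country or region is selected by a three-letter ISO code, a malformed code is an easy mistake to make before a long model run. The sketch below shows one way to normalise and validate such a code in base R. `normalise_iso3` is a hypothetical helper, not a function from this repository, and it checks only the format (three letters), not whether the code exists in the dataset.

```r
# Hypothetical helper (not part of this repository): trim whitespace,
# upper-case the input, and check that it is exactly three letters.
normalise_iso3 <- function(code) {
  code <- toupper(trimws(code))
  if (!grepl("^[A-Z]{3}$", code)) {
    stop("Expected a three-letter ISO code, got: ", code)
  }
  code
}

# Example: a lower-case code with stray spaces is cleaned up.
iso_code <- normalise_iso3(" gbr ")
print(iso_code)
```

How the main script expects the code to be supplied is not shown here, so the cleaned value would need to be passed in whatever way the script defines.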

To run the model at scale, for many countries or regions, we use a high-performance computing (HPC) cluster with the scripts found in

hpc_scripts