Data & code to reproduce paper:
Staub K*, Panczak R*, Matthes KL, Floris J, Berlin C, Junker C, Weitkunat R, Mamelund SE, Zwahlen M*, Riou J* (2021) Historically high excess mortality during the COVID-19 pandemic in Switzerland, Sweden and Spain. Annals of Internal Medicine. [Epub ahead of print 1 February 2022]. doi:10.7326/M21-3824
* Equal contribution.
Code version 2.1.
Earlier version of preprint:
Staub K*, Panczak R*, Matthes KL, Floris J, Berlin C, Junker C, Weitkunat R, Mamelund SE, Egger M, Zwahlen M*, Riou J* (2021) Pandemic excess mortality in Spain, Sweden, and Switzerland during the COVID-19 pandemic in 2020 was at its highest since 1918. medRxiv 2021.08.12.21261825. doi: https://doi.org/10.1101/2021.08.12.21261825
* Equal contribution.
Code version 1.0.
Details of the data sources, data preparation and main as well as supplementary analyses are available on the project pages.
Visual summary of repo content here.
.
+-- analyses
+-- data
+-- data-raw
+-- docs
+-- paper
| \-- supplementary
+-- R
+-- stan
\-- ubelix
The analyses folder contains a set of literate programming scripts written either in R or in R flavoured markdown. The files are numbered sequentially in the order in which they should be executed. The key parts of this folder include:
01_data-prepare.Rmd
This script downloads and prepares data from the Human Mortality Database, as well as ancillary, country-specific datasets containing updates of monthly death counts, yearly age-specific death counts and population figures from statistical agencies. The output of these analyses is presented here.
02_BfS-data-weekly.Rmd
This script prepares and compares weekly data on death counts for Switzerland from two sources: Short-term Mortality Fluctuations and the Swiss Federal Statistical Office.
03_joint-model-cmdstan.R
This script runs the main analyses of the paper; it sources functions from the R folder, which in turn source and compile Stan programs from the stan directory. One main model is fitted and then three sensitivity analyses are performed for each combination of country and analysis year. The script can be run from the command line with a parameter in the range 1-3 to obtain country-specific results; examples of such calls can be found in the scripts in the ubelix folder. Results will be stored in a folder named outputs_YYYY-MM-DD in the data directory. Note that, depending on the architecture, these analyses might take many hours or days to finish, and we strongly recommend running them in an appropriate computing environment: server, cloud or HPC.
04_joint-model-boot.R
This script performs an additional sensitivity analysis; due to its lower computational demands, it can be run reasonably fast on a decently equipped laptop.
05_outputs.Rmd
This script combines results obtained by running scripts 03 and 04 and prepares outputs (figures and tables) for the paper and supplement. The output of these analyses is presented here.
06_comparison_week_month.Rmd
This script performs an additional sensitivity analysis comparing estimates from weekly models to monthly models.
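The command-line parameter described for 03_joint-model-cmdstan.R could be read along the following lines. This is only a sketch: the helper name parse_country_index is hypothetical, and the repository's own argument handling may differ. Which index corresponds to which country is not stated here, so the sketch only validates and returns the index.

```r
# Hypothetical helper (not part of the repository): validate a single
# command-line argument that must be one of 1, 2 or 3.
parse_country_index <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) != 1) {
    stop("Expected exactly one argument (an integer from 1 to 3).")
  }
  if (!grepl("^[1-3]$", args[[1]])) {
    stop("Argument must be 1, 2 or 3.")
  }
  as.integer(args[[1]])
}

# Example with an explicit argument vector instead of the real command line
country_index <- parse_country_index("2")
```

When a script is started with Rscript followed by the script name and a number, commandArgs(trailingOnly = TRUE) returns that number as a character string, which is why the helper validates the text before converting it.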
Raw, unprocessed data are stored in and accessed from the data-raw folder; prepared data are stored in data. Both of these folders contain subdirectories named after the data source; for instance, the mortality_org folder stores data from the HMD, whereas the BfS folder stores data from the Swiss Federal Statistical Office [Bundesamt für Statistik (BfS)]. The outputs_YYYY-MM-DD folders in the data directory hold the results of analysis runs and can be reused to save computation time. Three files in the data directory with names starting with results_ store outputs of the main models that were used for reporting in the main figures and tables of the manuscript.
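To reuse earlier results, one way to locate the most recent outputs_YYYY-MM-DD folder is sketched below. The helper name latest_outputs_dir and its default argument are assumptions made for illustration; the repository scripts may select the folder differently.

```r
# Hypothetical helper: return the path of the most recent
# outputs_YYYY-MM-DD folder inside data_dir, or NA if none exists.
latest_outputs_dir <- function(data_dir = "data") {
  dirs <- list.dirs(data_dir, full.names = TRUE, recursive = FALSE)
  pattern <- "^outputs_[0-9]{4}-[0-9]{2}-[0-9]{2}$"
  dirs <- dirs[grepl(pattern, basename(dirs))]
  if (length(dirs) == 0) {
    return(NA_character_)
  }
  # ISO 8601 dates sort chronologically when sorted as text
  dirs[order(basename(dirs), decreasing = TRUE)][1]
}
```

Because the folder names embed the date in year-month-day order, a plain text sort is enough and no date parsing is needed.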
The docs folder stores html files generated from the scripts in the analyses folder, with names matching the scripts used to generate them. These files are used on the project pages.
The paper folder stores figures and tables presented in the paper and in the supplementary materials.
The R folder stores various scripts used for data manipulation or analyses by the scripts from the analyses folder. The most important scripts include:
fn_age_serfling_nb_cmdstan.R
used for the main analyses (age-adjusted negative binomial model implemented in Stan)
fn_global_serfling_nb_cmdstan.R
used for the sensitivity analyses (unadjusted negative binomial model implemented in Stan)
fn_global_serfling.R and fn_boot_pi.R
used for sensitivity analyses (unadjusted Poisson model)
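For readers unfamiliar with Serfling-type models, the general idea can be illustrated with a plain Poisson regression that combines a linear trend with annual sine and cosine terms. The sketch below uses simulated monthly counts and base R glm. It is not the code in fn_global_serfling.R, and the covariates, reference period and variable names are all assumptions made for illustration.

```r
# Simulated monthly death counts; these are NOT data from the paper.
set.seed(1)
d <- data.frame(
  year  = rep(2014:2019, each = 12),
  month = rep(1:12, times = 6)
)
d$t     <- seq_len(nrow(d))              # linear time index
d$sin12 <- sin(2 * pi * d$month / 12)    # annual seasonality terms
d$cos12 <- cos(2 * pi * d$month / 12)
mu <- exp(log(5000) + 0.001 * d$t + 0.10 * d$cos12)
d$deaths <- rpois(nrow(d), mu)

# Fit on a reference period, then predict expected deaths for the target year
train  <- d[d$year < 2019, ]
target <- d[d$year == 2019, ]
fit <- glm(deaths ~ t + sin12 + cos12,
           family = poisson(link = "log"), data = train)

target$expected <- as.numeric(predict(fit, newdata = target, type = "response"))
target$excess   <- target$deaths - target$expected  # observed minus expected
```

Excess mortality in this toy setting is simply the observed count minus the model-based expectation for each month; the models used in the paper additionally account for age structure and overdispersion, as noted in the list above.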
Scripts from the R folder above make calls to Stan programs stored in the stan directory. The names of these programs match the scripts calling them (age_serfling_nb.stan & global_serfling_nb.stan). Both programs have versions with 21 in the file name, which are special cases designed to handle incomplete 2021 data. Note that the programs will be compiled on their first run and will create additional files with the same name and a platform-dependent extension (for instance .exe on Windows OS; extensionless executables on *NIX systems).
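The compile-on-first-run behaviour can be seen in isolation with cmdstanr. The wrapper below is a sketch with a hypothetical name (compile_if_available); it skips compilation when cmdstanr or the Stan file is missing, and it assumes the working directory is the repository root.

```r
# Hypothetical wrapper: compile a Stan program with cmdstanr when both the
# package and the file are available; otherwise do nothing.
compile_if_available <- function(stan_file) {
  if (!requireNamespace("cmdstanr", quietly = TRUE) || !file.exists(stan_file)) {
    message("cmdstanr or ", stan_file, " not available; skipping compilation.")
    return(invisible(NULL))
  }
  mod <- cmdstanr::cmdstan_model(stan_file)  # compiles unless already up to date
  message("Executable written to: ", mod$exe_file())
  invisible(mod)
}

# Path assumed relative to the repository root
compile_if_available(file.path("stan", "age_serfling_nb.stan"))
```

On later calls cmdstanr reuses the existing executable as long as the Stan file has not changed, so the compilation cost is paid only once per program.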
The ubelix folder stores example OpenPBS scripts that can be used to submit the analyses scripts to an HPC cluster.
This folder should store the username and password for accessing the Human Mortality Database in plain text files named username.txt and password.txt.
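Reading such credential files needs only base R. The helper below is a sketch: the function name read_hmd_credentials is hypothetical, and the directory is passed as an argument because the folder name is not given here.

```r
# Hypothetical helper: read the first line of username.txt and password.txt
# from the directory passed as dir, trimming surrounding whitespace.
read_hmd_credentials <- function(dir) {
  read_one <- function(name) {
    path <- file.path(dir, name)
    if (!file.exists(path)) {
      stop("Missing credentials file: ", path)
    }
    trimws(readLines(path, n = 1, warn = FALSE))
  }
  list(
    username = read_one("username.txt"),
    password = read_one("password.txt")
  )
}
```

Since the files hold secrets in plain text, it is sensible to keep this folder out of version control.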
Details on the versions of R and the packages used are presented at the bottom of the output documents stored in the docs folder. Script 03_joint-model-cmdstan.R was run on an HPC cluster using R version 4.1.0 with package versions cmdstanr 0.4.0 & cmdstan 2.27.0.
Yearly differences
Monthly differences
Four pandemics
Age effects