
Tutorial for a target trial emulation with a time-varying exposure, time-dependent confounding, and a time-to-event outcome. Estimation is Sequentially Doubly Robust (SDR). Application is corticosteroids effect on mortality.


suarezveirano/steroids-trial-emulation

 
 


Steroids Target Trial Emulation Tutorial

This repository was created to help other analysts run an analysis similar to Hoffman et al.'s medRxiv preprint, Corticosteroids in COVID-19: Optimizing Observational Research Through Target Trial Emulations (2022).

This research was presented at the American Causal Inference Conference on May 24, 2022; slide deck available here.

Code Contents

  • analysis.R: a script to run a similar analysis (pared down to reduce computation time) with demo data of n=2000 patients

  • report_results.R: a script to clean the output of analysis.R

  • trt_timeline_viz.R: a script to create a patient treatment timeline similar to Supplemental Figure 1

Demo Data

The primary analysis is run using the open source R package lmtp (please note we use the sl3-compatible branch to improve computational speed). A helpful vignette is available here. We provide demo data in the data folder in combination with this visual representation of the required data format:

The required data structure for a longitudinal time-to-event analysis is wide (one row per subject), with one column per time point per variable (treatment, censoring indicator, outcome indicator, time-varying covariate). The exception to this is baseline variables, which by definition do not have multiple time points.

A few notes to help with pre-processing:

  • By default, a subject's "censoring" indicator should be 1, meaning the subject is observed at the next time point. If the subject is lost to follow-up, the indicator becomes 0. If the subject experiences the event at the next time point, the indicator should be 1 for that time point and NA for all following time points.

  • If a patient experiences the event, their outcome variables should be 1 for all time points until the end of the study.

  • If a patient has a censoring indicator of 0 (meaning they are lost to follow-up starting at the next time point), all columns corresponding to those future time points should have values of NA.
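The three notes above can be sketched with a toy wide data set in base R. This is only an illustration under stated assumptions: the column names (L_t, A_t, C_t, Y_t), the values, and the convention that Y_t is the outcome recorded at the time point following C_t are all hypothetical and are not taken from the demo data. Setting treatment and covariates to NA after the event is also an assumption.

```r
# Toy wide-format data: one row per subject, three time points.
# Hypothetical naming: L_t = time-varying covariate, A_t = treatment,
# C_t = censoring indicator (1 = observed at the next time point),
# Y_t = outcome indicator recorded after C_t (an assumption of this sketch).
demo <- data.frame(
  id  = 1:3,
  age = c(64, 71, 58),  # baseline variable: a single column, no time index
  L_1 = c(0.8, 1.4, 1.1), A_1 = c(0, 1, 0),
  C_1 = c(1, 1, 1),       Y_1 = c(0, 0, 0),
  L_2 = c(0.9, 1.6, 1.0), A_2 = c(1, 1, 0),
  C_2 = c(1, 1, 0),       Y_2 = c(0, 1, NA),
  L_3 = c(0.7, NA, NA),   A_3 = c(1, NA, NA),
  C_3 = c(1, NA, NA),     Y_3 = c(0, 1, NA)
)

# Subject 1: observed throughout, never has the event.
# Subject 2: has the event at Y_2, so C_2 is 1, later censoring is NA,
#            and the outcome stays 1 through the end of the study.
# Subject 3: C_2 is 0 (lost to follow-up), so every later column is NA.
demo[, c("id", "C_1", "Y_1", "C_2", "Y_2", "C_3", "Y_3")]
```

Printing only the censoring and outcome columns makes it easy to check each subject's row against the notes before running the full analysis.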

Analysis Specifications

Super learner libraries

The code to build the super learner libraries (via sl3) used in the paper's analysis is in analysis.R; however, all learners except LASSO and mean are commented out to reduce computation time. The same learners were used for the intervention and outcome mechanisms. We specified 10 folds for super learner cross-validation in the paper. The demo analysis code sets .SL_folds=5 to reduce computation time.

Time-dependent confounding assumption

We used a Markov assumption of 2, meaning a patient's time-dependent confounders for the previous two time periods (48-hour windows) were assumed sufficient to capture confounding for the next time point's mechanism. This decision stemmed from clinical knowledge (laboratory results are ordered at 24- or 48-hour intervals). The demo analysis code sets k=1 to reduce computation time.
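As a rough illustration of what a Markov window means, the base R helper below lists which time periods' covariates would be considered at a given time point. The function name markov_window is hypothetical, and the sketch assumes the window is the k most recent periods up to and including period t; the exact indexing used internally by lmtp may differ.

```r
# Hypothetical helper, not part of lmtp: return the time periods whose
# time-varying covariates fall inside a Markov window of size k ending at t.
# Assumption: the window includes period t itself.
markov_window <- function(t, k) {
  seq(max(1, t - k + 1), t)
}

markov_window(t = 5, k = 2)  # periods 4 and 5
markov_window(t = 5, k = 1)  # period 5 only
markov_window(t = 1, k = 2)  # period 1 only (no earlier history exists)
```

A smaller window means fewer covariate columns enter each nuisance model, which is why reducing k shortens run time in the demo.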

Cross-fitting

We employed 10-fold cross-fitting on our SDR estimator. The demo analysis code sets folds=5 to reduce computation time.
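The idea behind cross-fitting can be sketched in base R as a fold assignment: nuisance models are fit on the training folds and evaluated on the held-out fold. This is a minimal sketch of the splitting step only, not the code lmtp runs internally; the seed and the object names fold_id and splits are arbitrary choices for the illustration.

```r
set.seed(2022)   # arbitrary seed so the split is reproducible
n <- 2000        # demo data size stated in this README
n_folds <- 5     # matches folds=5 in the demo analysis code

# Randomly assign each subject to one fold, keeping fold sizes balanced.
fold_id <- sample(rep(seq_len(n_folds), length.out = n))

# For each fold, training rows would be used to fit the nuisance models and
# held-out (valid) rows are where those fitted models would be evaluated.
splits <- lapply(seq_len(n_folds), function(v) {
  list(train = which(fold_id != v), valid = which(fold_id == v))
})

lengths(lapply(splits, `[[`, "valid"))  # 400 400 400 400 400
```

Because the data are wide (one row per subject), splitting by row keeps each patient's full history in a single fold.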

Treatment timelines

A figure in the Supplemental Materials shows a random sample of 50 patients' treatment timelines. A blog post to aid other analysts in creating their own treatment timelines can be found here.
