Miao Qi edited this page Jun 18, 2021 · 4 revisions

Welcome to the Clinical Trial Representation wiki!

This is a supplement to the paper "Quantifying Representativeness in Randomized Clinical Trials using Machine Learning Fairness Metrics".

Note: This supplement was prepared by students and staff of The Rensselaer Institute for Data Exploration and Applications at Rensselaer Polytechnic Institute.

Objective

We formulate population representativeness of randomized clinical trials (RCTs) as a machine learning (ML) fairness problem, derive new representation metrics, and deploy them in visualization tools that help users identify subpopulations that are underrepresented in RCT cohorts with respect to national, community-based, or health system target populations.

Materials and Methods

We represent RCT cohort enrollment as random binary classification fairness problems, and then show how ML fairness metrics based on enrollment fraction can be efficiently calculated using easily computed rates of subpopulations in RCT cohorts and target populations. We propose standardized versions of these metrics and deploy them in an interactive tool to analyze three RCTs with respect to type-2 diabetes and hypertension target populations in the National Health and Nutrition Examination Survey (NHANES).
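
To make the idea of comparing subpopulation rates concrete, here is a minimal Python sketch. It is an illustration only: the function names `subgroup_rate` and `log_odds_ratio`, the choice of a log odds ratio as the comparison, and the example numbers are all assumptions made for this wiki, not the paper's exact metric definitions or data.

```python
import math


def subgroup_rate(subgroup_count, total_count):
    """Fraction of a population (RCT cohort or target) that belongs to a subgroup."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return subgroup_count / total_count


def log_odds_ratio(cohort_rate, target_rate):
    """Hypothetical disparity-style score (an assumption, not the paper's formula).

    Compares the odds of the subgroup in the RCT cohort with its odds in the
    target population. A value of 0 means the two rates match; negative values
    mean the subgroup is less common in the cohort than in the target.
    """
    for rate in (cohort_rate, target_rate):
        if not 0.0 < rate < 1.0:
            raise ValueError("rates must lie strictly between 0 and 1")

    def odds(p):
        return p / (1.0 - p)

    return math.log(odds(cohort_rate) / odds(target_rate))


# Made-up example: 120 of 1000 trial participants belong to a subgroup
# that makes up half of the target population.
cohort_rate = subgroup_rate(120, 1000)
score = log_odds_ratio(cohort_rate, target_rate=0.5)
print(f"cohort rate = {cohort_rate:.2f}, log odds ratio = {score:.3f}")
```

Under this sketch, only the two rates are needed for each subgroup, which is why such scores are cheap to compute once the cohort and target tables are available.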

Results

We demonstrate how the proposed metrics and associated statistics enable users to rapidly examine representativeness of all subpopulations in the RCT defined by a set of categorical traits (e.g., sex, race, ethnicity, smoker status, and blood pressure) with respect to target populations.
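
The phrase "all subpopulations defined by a set of categorical traits" can be sketched as follows. This is a hedged illustration, not the tool's implementation: the helper `enumerate_subgroups` and the trait values shown are hypothetical, and the sketch assumes a subpopulation is any combination of values over any non-empty subset of the traits.

```python
from itertools import combinations, product


def enumerate_subgroups(traits):
    """List every subgroup defined by fixing values for a non-empty subset of traits.

    `traits` maps a trait name to its list of categories. Each subgroup is
    returned as a dict such as {"sex": "female", "smoker": "yes"}.
    """
    names = sorted(traits)
    subgroups = []
    for size in range(1, len(names) + 1):
        # Choose which traits define the subgroup ...
        for chosen in combinations(names, size):
            # ... then every combination of category values for those traits.
            for values in product(*(traits[name] for name in chosen)):
                subgroups.append(dict(zip(chosen, values)))
    return subgroups


# Hypothetical traits and categories, for illustration only.
example_traits = {
    "sex": ["female", "male"],
    "smoker": ["yes", "no"],
}
for subgroup in enumerate_subgroups(example_traits):
    print(subgroup)
```

The number of subgroups grows multiplicatively with the number of traits and categories, which is one reason an interactive tool is helpful for scanning them.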

Discussion

The normalized metrics provide an intuitive, standardized scale for evaluating representation across subgroups, which may have vastly different enrollment fractions and rates in RCT study cohorts. The metrics complement other approaches (e.g., enrollment fractions and the Generalizability Index for Study Traits, GIST) used to assess the generalizability and health equity of RCTs.

Conclusion

By quantifying the gaps between RCT and target populations, the proposed methods can support generalizability evaluation of existing RCT cohorts, enrollment target decisions for new RCTs, and monitoring of RCT recruitment, ultimately contributing to more equitable public health outcomes.