A nonparametric framework for treatment effect modifier discovery in high dimensions

Abstract

Heterogeneous treatment effects are driven by treatment effect modifiers (TEMs), pretreatment covariates that modify the effect of a treatment on an outcome. Current approaches for uncovering these variables are limited to low-dimensional data, data with weakly correlated covariates, or data generated according to parametric processes. We resolve these issues by proposing a framework for defining model-agnostic TEM variable importance parameters (TEM-VIPs), deriving one-step, estimating equation, and targeted maximum likelihood estimators of these parameters, and establishing these estimators’ asymptotic properties. We showcase this framework by defining TEM-VIPs for data-generating processes with continuous, binary, and time-to-event outcomes with binary treatments, and by deriving accompanying asymptotically linear estimators. Simulation experiments demonstrate that these estimators’ asymptotic guarantees are approximately achieved at realistic sample sizes in randomized and observational studies alike. We also apply this methodology to gene expression data collected in a clinical trial assessing the effect of a novel therapy on disease-free survival in breast cancer patients. The predicted TEMs have previously been linked to treatment resistance.

Publication
Journal of the Royal Statistical Society Series B: Statistical Methodology