Recent Publications

simChef is an R package that empowers data science practitioners to rapidly plan, carry out, and summarize statistical simulation …

Individualized treatment rules, cornerstones of precision medicine, inform patient treatment decisions with the goal of optimizing …

The widespread availability of high-dimensional biological data has made the simultaneous screening of many biological characteristics …

The covariance matrix plays a fundamental role in many modern exploratory and inferential statistical procedures, including …

An endeavor central to precision medicine is predictive biomarker discovery; these biomarkers define patient subpopulations which stand to benefit …



unihtee implements nonparametric inference procedures for treatment effect modification variable importance parameters. These variable importance parameters reflect individual confounders’ capacity for treatment effect modification and are suited for high-dimensional data with potentially complex correlation structures.


uniCATE implements semiparametric inference procedures for variable importance parameters that assess biomarkers’ treatment effect modification capabilities in high-dimensional clinical trials.


The cvCovEst R package implements a data-adaptive framework for asymptotically optimal covariance matrix estimator selection in high dimensions.


The scPCA R package implements sparse contrastive PCA, a variant of PCA that extracts sparse, stable, and interpretable signal.


University of California, Berkeley

  • The Foundations of Data Science, Data 8 (Summer ’20) – Instructor
  • Statistical Analysis of Categorical Data, PBHLTH 241 (Spring ’20) – Graduate Student Instructor
  • Principles and Techniques of Data Science, DATA 100 (Spring ’19, Fall ’19) – Graduate Student Instructor
  • Introduction to Probability and Statistics in Biology and Public Health, PBHLTH 142 (Fall ’18) – Graduate Student Instructor

Selected Experiences



Analysis Group

Aug 2023 – Present Montreal, QC
Provide data-driven strategies for clients in the life sciences industry across all phases of product development and commercialization.

Associate Summer Intern

Analysis Group

Jun 2022 – Aug 2022 Montreal, QC
Wrote statistical analysis plans, performed analyses, and prepared results for the firm’s clients as a member of the Health Economics and Outcomes Research team. Developed a proprietary R software package for time-to-event data analysis.

Data Science Intern

Genentech / Roche

May 2021 – May 2023 Remote
Developed assumption-lean statistical inference methods for treatment effect modifier discovery and benchmarked them against existing procedures. Performed a comprehensive simulation study of software implementing mixed models for repeated measures. Implemented visualization functions in R for effective summaries of clinical trial data. Compiled and compared designs for phase I oncology trials and produced an educational document for internal use.

Graduate Student Researcher

University of California, Berkeley Superfund

Aug 2020 – Jun 2022 Berkeley, CA, United States
Analyzed epigenetic data collected by the organization’s environmental health scientists and epidemiologists to better understand the effects of chemical exposures on human health, developing and applying novel statistical methods to do so.

Instructor, Data 8: The Foundations of Data Science

University of California, Berkeley

May 2020 – Aug 2020 Berkeley, CA, United States
Taught foundational concepts in statistics and computer science to over 400 students while managing a team of teaching assistants.

Graduate Student Intern

Sutter Health - Research, Development and Dissemination

Jun 2019 – Aug 2019 Walnut Creek, CA, United States
Developed a statistical learning pipeline to evaluate a patient’s risk of becoming septic during their hospital visit.