Recent Publications

The covariance matrix plays a fundamental role in many modern exploratory and inferential statistical procedures, including …

An endeavor central to precision medicine is predictive biomarker discovery; these biomarkers define patient subpopulations which stand to benefit …

Adenoid cystic carcinoma (ACC) is the second most common cancer type arising from the salivary gland. The frequent occurrence of …

Ionizing radiation is a well-appreciated health risk and precipitant of DNA damage, and it contributes to DNA methylation variability. …



uniCATE implements semiparametric inference procedures for variable importance parameters that assess biomarkers’ treatment effect modification capabilities in high-dimensional clinical trials.


The cvCovEst R package implements a data-adaptive framework for asymptotically optimal covariance matrix estimator selection in high dimensions.


The scPCA R package implements sparse contrastive PCA, a variant of PCA that extracts sparse, stable, and interpretable signal.


University of California, Berkeley

  • The Foundations of Data Science, Data 8 (Summer ’20) – Instructor
  • Statistical Analysis of Categorical Data, PBHLTH 241 (Spring ’20) – Graduate Student Instructor
  • Principles and Techniques of Data Science, DATA 100 (Spring ’19, Fall ’19) – Graduate Student Instructor
  • Introduction to Probability and Statistics in Biology and Public Health, PBHLTH 142 (Fall ’18) – Graduate Student Instructor

Selected Experiences


Associate Summer Intern

Analysis Group

Jun 2022 – Aug 2022 Montreal, QC
Wrote statistical analysis plans, performed analyses, and prepared results for the firm’s clients as a member of the Health Economics and Outcomes Research team. Developed a proprietary R software package for time-to-event data analysis.

Data Science Intern

Genentech / Roche

May 2021 – Present Remote
Develop assumption-lean statistical inference methods for treatment effect modifier discovery, and benchmark them against existing procedures. Implement visualization functions in R for efficient exploration of phase I clinical trial data. Compile and compare designs for phase I oncology trials, and summarize them in an educational document for internal use.

Graduate Student Researcher

University of California, Berkeley Superfund

Aug 2020 – Present Berkeley, CA, United States
Develop and apply novel statistical methods to analyze data collected by the organization’s environmental health scientists and epidemiologists, with the goal of better understanding the effects of chemical exposures on human health.

Instructor, Data 8: The Foundations of Data Science

University of California, Berkeley

May 2020 – Aug 2020 Berkeley, CA, United States
Taught foundational concepts in statistics and computer science to over 400 students while managing a team of teaching assistants.

Graduate Student Intern

Sutter Health - Research, Development and Dissemination

Jun 2019 – Aug 2019 Walnut Creek, CA, United States
Developed a statistical learning pipeline to evaluate a patient’s risk of becoming septic during their hospital visit.