Yale School of Management

Research overview

Working at the intersection of healthcare operations and applied statistical learning, my research develops data-driven approaches for inferring the process dynamics of healthcare delivery systems, so that realistic models can be built to generate insights for improving the efficiency of healthcare. The specific areas of healthcare that I have worked on fall into three categories, all of which address questions faced by physicians and policymakers.


The incentive dynamics between Medicare and dialysis providers

End-Stage Renal Disease (ESRD) is chronic failure of the kidneys, and dialysis therapy is the only alternative to a kidney transplant. In the US, ESRD disproportionately affects the socio-economically disadvantaged, and those afflicted require dialysis three times a week to survive, typically at an outpatient dialysis clinic. Because of the crushing healthcare costs associated with ESRD, all medical expenses of almost all ESRD patients are covered by Medicare’s ESRD program, which spent almost $70,000 per patient in 2011 (US Renal Data System). Meanwhile, dialysis providers face conflicting financial interests when deciding how much effort to put into treating patients, which may adversely affect patient outcomes and Medicare’s costs. My co-authors and I propose evidence-based strategies for realigning provider incentives in order to improve patient outcomes, reduce Medicare expenditures, and increase provider earnings.


  • Lee and Zenios (2009): Optimal capacity overbooking for the regular treatment of chronic conditions. Operations Research 57:4:852-865.
  • Lee, Chertow, Zenios (2010): Re-exploring differences among for-profit and non-profit dialysis providers. Health Services Research 45:3:633-646.
  • Lee and Zenios (2012): An evidence-based incentive system for Medicare’s End-Stage Renal Disease Program. Management Science 58:6:1092-1105.


Outcomes evaluation of medical interventions/clinical trials

A large part of medical statistics is directed towards determining whether the treatment and control groups differ in outcomes. This task is sometimes complicated by a lack of sufficient information on the underlying data-generating or sampling process, without which the inferential target cannot be identified. Inspired by the general ideas behind partial identification and robust optimization, we show that the target can nonetheless be sharply bounded using the incomplete information that is available on these processes. The upper or lower bound then facilitates efficient nonparametric evaluation of the efficacy of a new medical treatment.


  • Aronow and Lee (2013): Interval estimation of population means under unknown but bounded probabilities of sample selection. Biometrika 100:1:235-240.
  • Aronow, Green, Lee (2014): Sharp bounds on the variance in randomized experiments. Annals of Statistics 42:3:850-871.
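
The bounding idea described above can be illustrated with a minimal sketch. It is not the estimator from either paper listed here; it is the textbook worst-case bound on a population mean when some outcomes are unobserved and the outcome is known to lie in an interval. The function name, the use of None to mark a missing outcome, and the example data are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def worst_case_mean_bounds(
    outcomes: Sequence[Optional[float]], lo: float, hi: float
) -> Tuple[float, float]:
    """Bound the mean of an outcome known to lie in [lo, hi].

    Missing outcomes (None) are replaced by lo for the lower bound and
    by hi for the upper bound, so no assumption is made about why they
    are missing. Illustrative only; hypothetical helper.
    """
    if not outcomes:
        raise ValueError("need at least one unit")
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    observed = [y for y in outcomes if y is not None]
    if any(y < lo or y > hi for y in observed):
        raise ValueError("observed outcome outside [lo, hi]")

    n = len(outcomes)
    n_missing = n - len(observed)
    total = sum(observed)
    lower = (total + n_missing * lo) / n
    upper = (total + n_missing * hi) / n
    return lower, upper


# Made-up binary outcomes for four patients; one outcome is unobserved.
example_bounds = worst_case_mean_bounds([1.0, 0.0, None, 1.0], lo=0.0, hi=1.0)
```

The width of the interval equals the missing fraction times (hi - lo), which is why additional information on the sampling process, of the kind the papers exploit, is needed to tighten it.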


Flows in healthcare queues

In ongoing work, I am investigating the use of patient flow data to uncover the transition dynamics underlying stochastic networks that arise in healthcare settings. One recent project used data from the national renal database to develop customized post-transplant survival forecasts for kidney transplant candidates. These predictions can help patients evaluate the benefits of accepting a particular donor kidney. From a queueing perspective, the survival process of a transplant recipient is represented by the transition of a patient from the post-transplant node to an absorbing state, so the recipient’s survival curve is given by the tail distribution of the corresponding transition time. This approach extends, for example, the statistical models behind Adjuvant! Online, a cancer survival forecasting tool that has gained prominence among oncologists and patients for its performance in evaluating the benefits of different adjuvant therapies.
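
The queueing view of survival described above can be made concrete with a small sketch. It assumes a discrete-time, time-homogeneous Markov chain with a single absorbing state, which is a simplification chosen for illustration and not necessarily the model used in the project. The state labels and transition probabilities are made up.

```python
from typing import List


def survival_curve(
    P: List[List[float]], start: int, absorbing: int, horizon: int
) -> List[float]:
    """Return S(t) = P(not yet absorbed after t steps) for t = 0..horizon.

    P is a row-stochastic transition matrix, and P[absorbing] must keep
    all of its mass on the absorbing state. Hypothetical helper.
    """
    n = len(P)
    dist = [0.0] * n
    dist[start] = 1.0
    curve = [1.0 - dist[absorbing]]
    for _ in range(horizon):
        # One step of the chain: new_dist[j] = sum_i dist[i] * P[i][j].
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
        curve.append(1.0 - dist[absorbing])
    return curve


# Illustrative states: 0 = functioning graft, 1 = graft failure, 2 = death.
P_example = [
    [0.90, 0.07, 0.03],
    [0.00, 0.85, 0.15],
    [0.00, 0.00, 1.00],
]
S_example = survival_curve(P_example, start=0, absorbing=2, horizon=10)
```

Each entry of S_example is the probability that the absorbing state has not yet been reached, that is, the tail of the absorption-time distribution starting from the post-transplant node. Patient-specific forecasts would correspond to letting the transition probabilities depend on patient and donor characteristics.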