Machine learning systems are widely deployed today, but they remain unreliable. They can fail, sometimes with catastrophic consequences, on subpopulations of the data, such as particular demographic groups, or when deployed in environments different from those they were trained in. In this talk, I will describe our work towards building reliable machine learning systems that are robust to these failures. First, I will show how we can use influence functions to understand the predictions and failures of existing models through the lens of their training data. Second, I will discuss the use of distributionally robust optimization to train models that perform well across all subpopulations. Third, I will describe WILDS, a benchmark of in-the-wild distribution shifts spanning applications such as pathology, conservation, remote sensing, and drug discovery, and show how current state-of-the-art methods, which perform well on synthetic distribution shifts, still fail to be robust on these real-world shifts. Finally, I will describe our work on building more reliable COVID-19 models, using anonymized cellphone mobility data, to inform public health policy; this is a challenging application because the underlying environment is often changing and there is substantial heterogeneity across demographic subpopulations.
Bio: Pang Wei Koh is a PhD student at Stanford, advised by Percy Liang. He studies the theory and practice of building reliable machine learning systems. His research has been published in Nature and Cell, featured in media outlets such as The New York Times and The Washington Post, and recognized by best paper awards at ICML and KDD, a Meta Research PhD fellowship, and the Kennedy Prize for the best honors thesis at Stanford. Prior to his PhD, he was the third employee and Director of Partnerships at Coursera.
This talk will be recorded and live-streamed at https://mediacentrallive.princeton.edu/