03-04
Rethinking Data Use in Large Language Models


Large language models (LMs) such as ChatGPT have revolutionized natural language processing and artificial intelligence more broadly. In this talk, I will discuss my research on understanding and advancing these models, centered around how they use the very large text corpora they are trained on. First, I will describe our efforts to understand how these models learn to perform new tasks after training, demonstrating that their so-called in-context learning capabilities are almost entirely determined by what they learn from the training data. Next, I will introduce a new class of LMs, nonparametric LMs, that repurpose this training data as a data store from which they retrieve information for improved accuracy and updatability. I will describe my work on establishing the foundations of such models, including one of the first broadly used neural retrieval models and an approach that simplifies a traditional, two-stage pipeline into one. I will also discuss how nonparametric models open up new avenues for responsible data use, e.g., by segregating permissive and copyrighted text and using them differently. Finally, I will envision the next generation of LMs we should build, focusing on efficient scaling, improved factuality, and decentralization.

Bio: Sewon Min is a Ph.D. candidate in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. Her research focuses on language models (LMs): studying the science of LMs, and designing new model classes and learning methods that make LMs more performant and flexible. She also studies LMs in information-seeking, legal, and privacy contexts. She is a co-organizer of multiple tutorials and workshops, including most recently at ACL 2023 on Retrieval-based Language Models and Applications and upcoming at ICLR 2024 on Mathematical and Empirical Understanding of Foundation Models. She won a paper award at ACL 2023, received a J.P. Morgan Fellowship, and was named an EECS Rising Star in 2022.


To request accommodations for a disability, please contact Emily Lawrence, emilyl@cs.princeton.edu, at least one week prior to the event.

Date and Time
Monday, March 4, 2024, 12:30pm - 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Host
Danqi Chen

Contributions to and/or sponsorship of any event do not constitute departmental or institutional endorsement of the specific program, speakers or views presented.
