12-04
AI Models for Edge Computing: Hardware-aware Optimizations for Efficiency

As artificial intelligence (AI) transforms various industries, state-of-the-art models have exploded in size and capability. The growth in AI model complexity is rapidly outstripping hardware evolution, so deploying these models on edge devices remains challenging. To enable advanced AI locally, models must be optimized to fit within hardware constraints. In this presentation, we will first discuss how computing hardware designs impact the effectiveness of commonly used AI model optimizations for efficiency, including techniques like quantization and pruning. Additionally, we will present several methods, such as hardware-aware quantization and structured pruning, to demonstrate the significance of software/hardware co-design. We will also demonstrate how these methods can be understood via a straightforward theoretical framework, facilitating their seamless integration in practical applications and their straightforward extension to distributed edge computing. At the conclusion of our presentation, we will share our insights and vision for achieving efficient and robust AI at the edge.

Bio: Yiran Chen received his B.S. (1998) and M.S. (2001) degrees from Tsinghua University and his Ph.D. (2005) from Purdue University. After spending five years in industry, he joined the University of Pittsburgh in 2010 as an Assistant Professor and was promoted to Associate Professor with tenure in 2014, holding the Bicentennial Alumni Faculty Fellow position. He currently serves as the John Cocke Distinguished Professor of Electrical and Computer Engineering at Duke University. He is also the director of the NSF AI Institute for Edge Computing Leveraging Next-generation Networks (Athena), the NSF Industry-University Cooperative Research Center (IUCRC) for Alternative Sustainable and Intelligent Computing (ASIC), and the co-director of the Duke Center for Computational Evolutionary Intelligence (DCEI). His group's research focuses on new memory and storage systems, machine learning and neuromorphic computing, and mobile computing systems. Dr. Chen has published one book and more than 600 technical publications, and has been granted 96 US patents. He has received 11 Ten-Year Retrospective Influential Paper Awards, Outstanding Paper Awards, Best Paper Awards, and Best Student Paper Awards, as well as 2 best poster awards and 15 best paper nominations from various international journals, conferences, and workshops. He has been honored with numerous awards for his technical contributions and professional service, including the IEEE CASS Charles A. Desoer Technical Achievement Award and the IEEE Computer Society Edward J. McCluskey Technical Achievement Award. He has been a distinguished lecturer for IEEE CEDA and CAS, is a Fellow of the AAAS, ACM, and IEEE, and currently serves as the chair of ACM SIGDA and the Editor-in-Chief of the IEEE Circuits and Systems Magazine. He is a founding member of the steering committee of the Academic Alliance on AI Policy (AAAIP).


To request accommodations for a disability, please contact Emily Lawrence at emilyl@cs.princeton.edu at least one week prior to the event.
This talk will be recorded and live streamed via Zoom. Webinar registration here.

Date and Time
Monday December 4, 2023 12:30pm - 1:30pm
Location
Computer Science Small Auditorium (Room 105)
Speaker
Yiran Chen, Duke University
Host
Kai Li

Contributions to and/or sponsorship of any event does not constitute departmental or institutional endorsement of the specific program, speakers or views presented.
