02-12
Coordination Avoidance in Distributed Databases

The rise of Internet-scale geo-replicated services has led to considerable upheaval in the design of modern data management systems. Namely, given the availability, latency, and throughput penalties associated with classic mechanisms such as serializable transactions, a broad class of systems (e.g., "NoSQL") has sought weaker alternatives that reduce the use of expensive coordination during system operation, often at the cost of application integrity. When can we safely forego the cost of this expensive coordination, and when must we pay the price?

In this talk, I will discuss the potential for coordination avoidance -- the use of as little coordination as possible while still ensuring application integrity -- in several modern data-intensive domains. Specifically, I will demonstrate how to leverage the semantic requirements of applications in data serving, transaction processing, and statistical analytics to enable more efficient distributed algorithms and system designs. The prototype systems I have built demonstrate order-of-magnitude speedups compared to their traditional, coordinated counterparts on a variety of tasks, including referential integrity and index maintenance, transaction execution under common isolation models, and asynchronous convex optimization. I will also discuss our experiences studying and optimizing a range of open source applications and systems, which exhibit similar results.

Peter Bailis is a Ph.D. candidate at UC Berkeley working in databases and distributed systems. As part of his dissertation work, he has studied and built high-performance distributed data management systems for large-scale transaction processing, data serving, and statistical analytics in the AMPLab and BOOM projects, advised by Joseph M. Hellerstein, Ali Ghodsi, and Ion Stoica. He is the recipient of the NSF Graduate Research Fellowship, the Berkeley Fellowship for Graduate Study, and best-of-conference citations for research appearing in both SIGMOD and VLDB. He received his A.B. in Computer Science from Harvard College in 2011, where he also received the CRA Outstanding Undergraduate Researcher Award.

Date and Time

Thursday, February 12, 2015, 12:30pm - 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Event Type

CS Department Colloquium Series

Speaker

Peter Bailis, from University of California, Berkeley

Host

Michael Freedman

Contributions to and/or sponsorship of any event do not constitute departmental or institutional endorsement of the specific program, speakers, or views presented.

