Dmitry Paramonov will present his FPO "Handling Data Too Large To Handle: On Multi-pass Streaming and Interactive Coding" on May 7th, 2024 at 11am in Friend 006.
His committee is as follows:
Examiners: Gillat Kol (adviser), Ran Raz, Mark Braverman
Readers: Huacheng Yu and Matt Weinberg
Title: Handling Data Too Large To Handle: On Multi-pass Streaming and Interactive Coding
Abstract:
Over the last decades, the world has become increasingly more information-centric and massive
amounts of data, potentially distributed between many different sources, are being processed all the
time. In this thesis, I consider two mechanisms for coping with big data and the distributed nature
of timely tasks.
In Part I, I showcase my work on the streaming setting, where the input to the algorithm is
given as a stream of elements. The algorithm’s goal is to compute a value that depends on the
stream while only utilizing memory that is much smaller than the entire stream. My work in this
field focuses on proving that various fundamental graph problems essentially require the streaming
algorithm to store the entire graph, even if it is allowed to make several passes through the given
stream of edges.
In Part II, I consider error-correcting codes for distributed, interactive settings. Classical error-
correcting codes assume that a sender who has all the information wishes to send it to a receiver over
a noisy channel. However, in many modern, big data applications, the information is distributed
amongst many parties that communicate back-and-forth to compute a value that depends on all
their inputs. My work examines the noise resilience of various such settings. For some models,
we can design error-correcting protocols that allow the encoding of every noiseless protocol by a
noise-resilient protocol with low overhead, whereas for other models, it can be shown that this task
is impossible.
While these two topics appear greatly unrelated, and almost orthogonal to one another, the tools
used to prove results in both turn out to be remarkably similar, with many standard problems and
information theory lemmas being a critical part of both.