Tolerating Slowdowns in Replicated State Machines using Copilots: Pseudocode and Proof of Correctness

Report ID: TR-004-20
Author: Lloyd, Wyatt / Sen, Siddhartha / Ngo, Khiem
Date: 2021-08-31
Pages: 23
Download Formats: |PDF|
Abstract:

Abstract: This technical report presents the pseudocode and a proof of correctness for Copilot replication, a new consensus protocol that achieves 1-slowdown-tolerance: it delivers normal latency despite the slowdown of any 1 replica. Copilot achieves this by using two distinguished replicas—the pilot and copilot—to proactively add redundancy to all stages of processing a client’s command. We give an overview of the Copilot protocol and then present its pseudocode. We also provide a proof of correctness in the asynchronous crash failure model, showing that: Copilot requires 2f + 1 replicas to tolerate at most f crash failures, while guaranteeing safety (linearizability) despite any number of failures, and liveness as long as a majority of replicas communicate in a timely manner. These guarantees are the same as those of Multi-Paxos and EPaxos, which Copilot draws inspiration from. However, Copilot is the only protocol that provides 1-slowdown-tolerance in the presence of any one slow replica. For a detailed discussion of Copilot’s design, optimizations, and evaluation, please refer to the conference paper.