On the Fault Tolerance of Some Popular Bounded-Degree Networks

Report ID: TR-385-92
Author: Maggs, Bruce / Leighton, Tom / Sitaraman, Ramesh Kumar
Date: 1992-09-00
Pages: 42
Download Formats: |Postscript|
Abstract:

In this paper, we analyze the ability of several bounded-degree networks that are commonly used for parallel computation to tolerate faults. Among other things, we show that an $N$-node butterfly containing $N^{1-epsilon}$ worst-case faults (for any constant $epsilon > 0$) can emulate a fault-free butterfly of the same size with only constant slowdown. Similar results are proved for the shuffle-exchange graph. Hence, these networks become the first connected bounded-degree networks known to be able to sustain more than a constant number of worst-case faults without suffering more than a constant-factor slowdown in performance. We also show that an $N$-node butterfly whose nodes fail with some constant probability $p$ can emulate a fault-free version of itself with a slowdown of $2^{O(log^* N)}$, which is a very slowly increasing function of $N$. The proofs of these results combine the technique of redundant computation with new algorithms for (packet) routing around faults in hypercubic networks. Techniques for reconfiguring hypercubic networks around faults that do not rely on redundant computation are also presented. These techniques are less tolerant of faults but are more widely applicable since they can be used with other networks such as binary trees and meshes of trees.