CS

A Critical Analysis of Multigrid Methods on Massively Parallel Computers

Report ID: TR-448-94

Author: Matheson, Lesley R. / Tarjan, Robert E.

Date: 1994-02-00

Pages: 14

Download Formats: |Postscript|

Abstract:

The hierarchical nature of multigrid algorithms leaves domain parallel strategies with a deficiency of parallelism as the computation moves to coarser and coarser grids. To introduce more parallelism several strategies have been designed to project the original problem space into non-interfering subspaces, allowing all grids to relax concurrently. Our objective is to understand the potential efficiency of standard and concurrent multigrid algorithms on existing and proposed massively parallel machines. We study model problems on simple domains discretized with finite difference techniques on block structured meshes in two and three dimensions with up to $10^{6}$ and $10^{9}$ points, respectively. Performance of the standard domain parallel V and F cycle schemes is compared to several proposed concurrent algorithms. The multigrid strategies are studied in several models of parallel computation, designed to reflect both the characteristics of existing machines and those of the proposed next generation of multicomputers. These models include a SIMD fine-grain model which contains a large number $(10^{4} - 10^{6})$ of small (bit-serial) processors as well as a SPMD medium-grain model with a more modest number (256-16,384) of powerful (single chip) processors interconnected through a single stage or multistage communication network. Our analysis suggests that obtaining acceptable levels of performance requires optimization techniques which address the salient characteristics of each architectural class of massively parallel computers. With the appropriate optimization techniques, a comparison indicates that the F-cycle is potentially more efficient than the V-cycle despite the relative increase in coarse grid activity. In addition, the analysis suggests that subspace parallelism is too expensive to be practical, except under a very limited set of conditions, on either class of massively parallel computers.