Integrated Parallel Prefetching and Caching

Report ID: TR-502-95
Author: Cao, Pei
Date: 1995-12-00
Pages: 18
Download Formats: Postscript
Abstract:

Recently there has been considerable interest in prefetching on parallel disks. Prefetching is considered an important technique for exploiting the parallelism of multiple disks for serial applications. Studies have also shown that, for optimal performance, it is important to integrate prefetching and caching. In this paper, we study integrated parallel prefetching and caching strategies for multiple disks. We present two algorithms, regular aggressive and reverse aggressive, and give evidence that reverse aggressive is close to optimal. Using trace-driven simulation with a collection of file access traces, we evaluate these algorithms under a variety of data placement policies, including striping and random placement. Our results show that both algorithms can achieve near-linear speedup when the load is distributed evenly across the disks, and that reverse aggressive performs well even when the data placement policy distributes the load unevenly. In particular, the results show that when prefetching is done well, the striping data placement strategy performs close to replicating data across all of the disks. We also evaluate four online variations of the algorithms and show that the online algorithms perform well even with moderate advance knowledge of future file accesses.