Application-Controlled File Caching and Prefetching (Thesis)
Report ID: TR-522-96
Author: Cao, Pei
Date: May 1996
Pages: 153
Download Formats: Postscript
Abstract:
As disk performance continues to lag behind that of microprocessors and memory systems, the file system is increasingly becoming the bottleneck for many applications. This dissertation demonstrates that, with appropriate mechanisms and policies, application-controlled file caching and prefetching can significantly improve the file I/O performance of applications in both single-process and multi-process cases. In traditional file systems, the kernel controls file cache replacement and prefetching using fixed policies, with no input from user processes. For many applications, this results in poor utilization of the file cache and little overlap between I/O and computation. The challenges are to design a scheme that allows applications to control the management of their file cache, and to design algorithms for the kernel to coordinate the use of shared resources so that the performance of the whole system is guaranteed to improve.

This dissertation proposes two-level file cache management: the kernel allocates physical pages to individual applications, and each application decides how to use its own pages. The dissertation addresses three issues: a global allocation policy that lets applications control their own cache replacement while maintaining fair allocation of cache blocks among processes; integrated algorithms for caching and prefetching; and a low-overhead mechanism to implement the interactions between user processes and the kernel.

A prototype file system, ACFS, was implemented to experiment with application-controlled file caching and prefetching on a suite of I/O-intensive applications. Experiments show that application-controlled file caching and prefetching, combined with disk scheduling, significantly improve applications' file I/O performance: individual applications' running times are reduced by 3% to 49% (average 26%), and multi-process workloads' running times are reduced by 5% to 76% (average 32%).
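The two-level split can be illustrated with a minimal sketch (all names and the simulation itself are invented for illustration, not the actual ACFS interface): the "kernel" grants each process a fixed quota of page frames, and the application itself chooses which of its resident blocks to evict on a miss. A repeated sequential scan shows why application control matters: a fixed LRU policy evicts exactly the block needed next, while an application that knows its access pattern can choose MRU and keep most of its working set resident.

```python
# Hypothetical sketch of two-level cache management (invented names, not
# the actual ACFS interface): the kernel fixes the per-process quota; the
# application supplies the replacement policy.

class AppCache:
    """Per-application cache: the eviction policy is supplied by the app."""

    def __init__(self, quota, pick_victim):
        self.quota = quota                # frames granted by the kernel
        self.blocks = []                  # resident blocks, oldest first
        self.pick_victim = pick_victim    # application-chosen policy
        self.hits = 0
        self.misses = 0

    def access(self, block):
        if block in self.blocks:
            self.hits += 1
            self.blocks.remove(block)
            self.blocks.append(block)     # refresh recency on a hit
            return
        self.misses += 1                  # miss: simulated disk read
        if len(self.blocks) >= self.quota:
            self.blocks.remove(self.pick_victim(self.blocks))
        self.blocks.append(block)

def lru(blocks):
    return blocks[0]      # fixed kernel-style policy: least recently used

def mru(blocks):
    return blocks[-1]     # better choice for looping sequential scans

# Repeated sequential scan over 5 blocks with only 4 frames.
trace = [0, 1, 2, 3, 4] * 20
results = {}
for name, policy in (("LRU", lru), ("MRU", mru)):
    cache = AppCache(quota=4, pick_victim=policy)
    for b in trace:
        cache.access(b)
    results[name] = cache.hits
print(results)  # LRU gets no hits on this trace; MRU gets many
```

On this trace LRU misses on every access, while the MRU-controlled cache hits on most of them, which is the kind of per-application gain the global allocation policy must preserve without starving other processes.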
Each technique provides substantial performance benefits: application-controlled file caching reduces the number of disk I/Os, carefully integrated caching and prefetching increase the overlap between CPU computation and disk accesses, and disk scheduling combined with prefetching reduces the average disk access latency. The combination of all three techniques provides the best performance.
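The overlap argument can be made concrete with a back-of-the-envelope timing model (the numbers and the one-block-lookahead pipeline are assumptions for illustration, not measurements from the thesis): without prefetching, every access waits for the disk and then computes; with prefetching, the next block's disk read proceeds while the current block is processed, so each step costs only the larger of the two latencies.

```python
# Toy timing model: n block accesses, each needing `cpu` ms of computation
# and `disk` ms of disk latency. All parameter values are illustrative.

def serial_time(n, cpu, disk):
    # No prefetching: fetch, then compute, strictly in sequence.
    return n * (disk + cpu)

def pipelined_time(n, cpu, disk):
    # One-block lookahead: while block i is processed, block i+1 is
    # fetched, so each middle step costs max(cpu, disk); only the first
    # fetch and the last computation are exposed.
    return disk + (n - 1) * max(cpu, disk) + cpu

print(serial_time(100, 10, 15))     # 100 * 25 = 2500 ms
print(pipelined_time(100, 10, 15))  # 15 + 99 * 15 + 10 = 1510 ms
```

Under these assumed numbers, prefetching hides most of the disk latency behind computation; disk scheduling then attacks the remaining `disk` term directly by lowering the average access latency itself.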