Availability, Scalability and Cost-Effectiveness of Cluster-Based Internet Infrastructures (Thesis)

Report ID: TR-633-01
Author: Ji, Minwen
Date: 2001-02-00
Pages: 118
Download Formats: PDF
Abstract:

Clusters of commodity computers are a cost-effective hardware platform for large-scale Internet services. Availability and scalability are major concerns in the design of infrastructures for such services. My dissertation examines opportunities in data storage systems for improving the availability and scalability of cluster-based Internet infrastructures at low cost. The goal of availability is to maximize the percentage of client requests that succeed despite the failure of one or more servers in the cluster. The goal of scalability is to scale server throughput efficiently with cluster size. My basic approach is to investigate strategies for distributing data and requests across the nodes of the cluster, that is, how to partition and replicate data on disk or in memory, and how to direct requests to the right partitions, in order to achieve high availability and scalability.

Maintaining availability in the face of failures is a critical requirement for Internet services. Existing approaches in cluster-based data storage rely on redundancy to survive a small number of failures, but the system becomes largely unavailable if more failures occur. I study a failure isolation approach in which each server in the cluster can deliver data to clients independently of the failures of other servers. This approach is complementary to existing redundancy-based methods: redundancy can mask the first few failures, and failure isolation can take over and maintain availability for the majority of clients if more failures occur.
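As a rough illustration of the contrast drawn above, the following toy model compares the fraction of client requests that succeed under redundancy alone, under failure isolation alone, and under the two combined. It is only a sketch under simplifying assumptions that are not taken from the thesis: requests are spread uniformly over the servers, redundancy masks up to a hypothetical `tolerated` number of failures and the system is treated as fully unavailable beyond that, and under failure isolation every surviving server keeps serving its own share of requests. The function names are illustrative.

```python
def availability_redundancy_only(n_servers, n_failed, tolerated=1):
    """Toy model: redundancy masks up to `tolerated` failures; beyond that
    the whole system is treated as unavailable (simplifying assumption)."""
    if not 0 <= n_failed <= n_servers:
        raise ValueError("n_failed must be between 0 and n_servers")
    return 1.0 if n_failed <= tolerated else 0.0


def availability_failure_isolation(n_servers, n_failed):
    """Toy model: each surviving server serves its own data independently,
    so only requests aimed at failed servers are lost. Assumes requests
    are spread uniformly over the servers."""
    if not 0 <= n_failed <= n_servers:
        raise ValueError("n_failed must be between 0 and n_servers")
    return (n_servers - n_failed) / n_servers


def availability_combined(n_servers, n_failed, tolerated=1):
    """Toy model of the complementary scheme: redundancy masks the first
    few failures, then failure isolation limits the damage."""
    if n_failed <= tolerated:
        return availability_redundancy_only(n_servers, n_failed, tolerated)
    return availability_failure_isolation(n_servers, n_failed)


if __name__ == "__main__":
    # Compare the three toy models for a 10-server cluster.
    for failed in range(0, 5):
        print(
            failed,
            availability_redundancy_only(10, failed),
            availability_failure_isolation(10, failed),
            availability_combined(10, failed),
        )
```

In this model, once the number of failures exceeds what redundancy can mask, the redundancy-only scheme drops to zero while the combined scheme still serves the requests directed at surviving servers.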

The ability to achieve high quality of service with minimal committed resources yields savings in equipment cost, power consumption, and administration effort for Internet services. I study how to improve the price-performance ratio of Internet application servers by efficiently managing a cluster of in-memory databases as a cache for dynamic content. I observe that, despite the challenges of dynamic content, a good management strategy can be found at least for certain applications. Such a strategy strives to maximize effective cache capacity and minimize synchronization cost; it is lightweight and adapts dynamically to changes in load and access patterns.