SkipIndex: Towards a Scalable Peer-to-Peer Index Service for High Dimensional Data
Report ID: TR-703-04Author: Zhang, Chi / Krishnamurthy, Arvind / Wang, Randolph Y.
Date: 2004-05-00
Pages: 16
Download Formats: |PDF|
Abstract:
Indexing of high-dimensional data is essential for building applications such as multimedia retrieval, data mining, and spatial databases. Traditional index structures rely on centralized processing. This approach does not scale with the rapidly increasing amount of application data available on massively distributed systems like the Internet.
In this paper, we propose a distributed high-dimensional index structure based on peer-to-peer overlay routing. A new routing scheme is used to lookup data keys in the distributed index, which guarantees logarithmic lookup and maintenance cost, even in the face of skewed datasets. We propose a novel nearest neighbor (NN) query scheme that can substantially reduce search cost by sacrificing a small amount of precision. We propose a load-balancing mechanism that partitions the high dimensional search space in a balanced manner. We then analyze the performance of our proposed using a variety of metrics with simulation as well as a functional PlanetLab implementation.