Studies in Algorithms (Thesis)

Report ID: TR-763-06
Author: Ailon, Nir
Date: 2006-08-00
Pages: 151
Download Formats: |PDF|
Abstract:

This work is comprised of three separate parts: (1) Lower bounds for linear degeneracy testing (based on joint work with Bernard Chazelle); (2) Aggregating inconsistent information (based on joint work with Moses Charikar and Alantha Newman); and (3) The fast Johnson-Lindenstrauss transform and approximate nearest neighbor searching (based on joint work with Bernard Chazelle). The first part discusses a fundamental computational geometric problem called rSUM: given n real numbers, do any r of them sum up to 0? It is the simplest case of a more general family of problems called degeneracy testing. The computational model assumed is the linear decision tree. We extend and improve a seminal result by Erickson from 1999. The second part studies optimization algorithms designed for integrating information coming from different sources. This framework includes the well-known problem of voting from the old theory of social choice. Recently, the computational aspects of these problems have been studied, and several hardness results were proved. We design improved algorithms for two important problems known as rank aggregation and consensus clustering. In our analysis we prove new results on optimization over binary relations (in particular, order and equivalence relations). The third part revisits the computational aspects of a well-known lemma by Johnson and Lindenstrauss from the mid 80's, stating that any finite subset of a Euclidean space can be almost isometrically embedded in a space of dimension at most logarithmic in the size of the subset by projecting onto a random subspace. This technique improves many algorithms suffering from running time and/or space depending heavily on the dimensionality. In this work we define a new distribution on linear transformations with a significant computational improvement. We call it the Fast Johnson-Lindenstrauss Transform (FJLT), and show how to apply it to improve nearest neighbor searching in Euclidean space.