|
CMU-CS-02-193
Computer Science Department
School of Computer Science, Carnegie Mellon University
CMU-CS-02-193
Adventures in Ultra-Small-Space Clustering
Adam Meyerson, Liadan O'Callaghan*, Serge Plotkin*
December 2002
CMU-CS-02-193.ps
CMU-CS-02-193.pdf
Keywords: Algorithms, approximation, clustering, sampling,
k-median, facility location
We give a sampling-based algorithm for the k-Median problem,
with running time where [.....] k is the desired cluster number and
is a confidence parameter. This is the first k-Median algorithm
with fully polynominial running time that is independent of n,
the size of the data set. It gives a solution that is, with high
probability, an O(1) approximation if each cluster in the optimal
solution has [....] points. We give near-matching lower bounds showing that
this assumption about cluster size is necessary. We also present a
related algorithm for finding a clustering that excludes a small number
of outliers.
13 pages
* Department of Computer Science, Stanford University
|