CMU-CS-11-128
Computer Science Department
School of Computer Science, Carnegie Mellon University



CMU-CS-11-128

Efficient Parallel Approximation Algorithms

Kanat Tangwongsan

August 2011

Ph.D. Thesis

CMU-CS-11-128.pdf


Keywords: NA

Over the past few decades, advances in approximation algorithms have enabled near-optimal solutions to many important, but computationally hard, problems to be found efficiently. But despite the resurgence of parallel computing in recent years, only a small number of these algorithms have been considered from the standpoint of parallel computation. Furthermore, among those for which parallel algorithms do exist, the algorithms – mostly developed in the late 80s and early 90s – follow a design principle that puts heavy emphasis on achieving polylogarithmic depth, with little regard to work, generally resulting in algorithms that perform substantially more work than their sequential counterparts. As a result, on a modest number of processors, these highly parallel–but "heavy"–algorithms are unlikely to perform competitively with the state-of-the-art sequential algorithms. �is motivates the question: How can one design a parallel approximation algorithm that obtains non-trivial speedups over its sequential counterpart, even on a modest number of processors?

In this thesis, we explore a set of key algorithmic techniques that facilitate the design, analysis, and implementation of a wide variety of efficient parallel approximation algorithms. This includes:

Maximal nearly independent set. A natural generalization of maximal independent set (MIS) solvable in linear work, which leads to linear-work RNC ((1+ ε) ln n)-approximation set cover, (1-1/e-ε)-approximation (prefix optimal) max cover, and (4+ε)-approximation min-sum set cover–and a work-efficient RNC (1:861+&episilon;)-approximation for facility location.
Low-diameter decomposition, low-stret� spanning trees, and subgraphs. This allows us to develop a near-linear work, O(m1/3)-depth algorithm for solving a symmetric diagonally dominant (SDD) linear system Ax = b, with m nonzeros. �e solver leads fast parallel algorithms for max flow, min-cost flow, (spectral) sparsifier, etc.
Probabilistic tree embeddings. An RNC O(n2 log n)-work algorithm for probabilistic tree embeddings with expected stretch O(log n), independent of the aspect ratio of the input metric. This is a parallel version of Fakcharoenphol et. al.'s algorithm, providing a building block for algorithms for k-median and buy-at-bulk network.
Hierarchical diagonal blocking. A sparse matrix representation that exploits the small separators property found in many real-world matrices. We also develop a low-depth parallel algorithm for the representation, which achieves substantial speedups over existing SpMV code.

200 pages



Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by [email protected]