CMU-CS-13-101 Computer Science Department School of Computer Science, Carnegie Mellon University
Priority Update as a Parallel Primitive Julian Shun, Guy E. Blelloch, Jeremy T. Fineman*, Phillip B. Gibbons** February 2013
Memory contention can be a serious performance bottleneck in concurrent programs on sharedmemory multicore architectures. Having all threads write to a small set of shared locations, for example, can lead to orders of magnitude loss in performance relative to all threads writing to distinct locations, or even relative to a single thread doing all the writes. Shared write access, however, can be very useful in parallel algorithms, concurrent data structures, and protocols for communicating among threads. In this paper we study the "priority update" operation as a useful primitive for limiting write contention in parallel and concurrent programs. A priority update takes as arguments a memory location, a new value, and a comparison function >p that enforces a partial order over values. The operation atomically compares the new value with the current value in the memory location, and writes the new value only if it has higher priority according to >p. On the implementation side, we show that if implemented appropriately, priority updates greatly reduce memory contention over standard writes or other atomic operations when locations have a high degree of sharing. This is shown both experimentally and with theoretical justification. On the application side, we describe several uses of priority updates for implementing parallel algorithms and concurrent data structures, often in a way that is deterministic, guarantees progress, and avoids serial bottlenecks. We present experimental results showing that a variety of such algorithms and data structures perform well under high degrees of sharing. Given the results, we believe the priority update operation serves as a useful parallel primitive and good programming abstraction as (1) the user largely need not worry about the degree of sharing, (2) it can be used to avoid non-determinism since, in the common case when >p is a total order, priority updates commute, and (3) it has many applications to programs using shared data.
27 pages
*Georgetown University
| |
Return to:
SCS Technical Report Collection This page maintained by [email protected] |