CMU-CS-13-126
Computer Science Department
School of Computer Science, Carnegie Mellon University



CMU-CS-13-126

Algorithmic Engineering Towards
More Efficient Key-Value Systems

Bin Fan

December 2013

Ph.D. Thesis

CMU-CS-13-126.pdf


Keywords: Memory Efficiency, Hash Table, Bloom Filter, Caching, Load Balancing


Distributed key-value systems have been widely used as elemental components of many Internet-scale services at sites such as Amazon, Facebook and Twitter. This thesis examines a system design approach to scale existing key-value systems, both horizontally and vertically, by carefully engineering and integrating techniques that are grounded in re- cent theory but also informed by underlying architectures and expected workloads in practice. As a case study, we re-design FAWN-KV–a distributed key-value cluster consisting of "wimpy" key-value nodes—to use less memory but achieve higher throughput even in the worst case.

First, to improve the worst-case throughput of a FAWN-KV system, we propose a randomized load balancing scheme that can fully utilize all the nodes regardless of their query distribution. We analytically prove and empirically demonstrate that deploying a very small but ex- tremely fast load balancer at FAWN-KV can effectively prevent uneven or dynamic workloads creating hotspots on individual nodes. Moreover, our analysis provides service designers a mathematically tractable ap- proach to estimate the worst-case throughput and also avoid drastic over- provisioning in similar distributed key-value systems.

Second, to implement the high-speed load balancer and also to improve the space efficiency of individual key-value nodes, we propose novel data structures and algorithms, including the cuckoo filter, a Bloom filter replacement that is high-speed, highly compact and delete-supporting, and optimistic cuckoo hashing, a fast and space-efficient hashing scheme that scales on multiple CPUs. Both algorithms are built upon conventional cuckoo hashing but are optimized for our target architectures and workloads. Using them as building blocks, we design and implement MemC3 to serve transient data from DRAM with high throughput and low-latency retrievals, and SILT to provide cost-effective access to persis- tent data on flash storage with extremely small memory footprint (e.g., 0.7 bytes per entry).

134 pages



Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by [email protected]