CMU-CS-24-149 Computer Science Department School of Computer Science, Carnegie Mellon University
Designing Efficient and Scalable Key-value Juncheng Yang Ph.D. Thesis December 2024
Software caches have been widely deployed at scale in today's computing infrastructure to improve data access latency and throughput. These caches consume PBs of DRAM across the industry, necessitating high efficiency – reducing DRAM consumption without compromising the miss ratio.Meanwhile, modern servers have hundreds of cores per CPU, making thread scalability a critical requirement for designing software caches. This thesis explores multiple approaches from system and algorithm perspectives to improve the efficiency and scalability of software caches. This thesis has two parts. The first part focuses on new system designs that allow caches to store more objects in the cache to achieve a low miss ratio. In this part, I will describe three works. First, I will discuss a large-scale production key-value cache workload analysis. Second, drawing on insights from the workload study, I will describe the design of Segcache, a TTL-indexed segment-structured key-value cache that removes expired objects quickly, uses tiny object metadata, and enables close-to-linear scalability. Third, I will present C2DN and demonstrate how to use erasure coding to design a highly efficient fault-tolerant CDN cache cluster. The second part focuses on new algorithms that allow the cache to store more useful objects in the cache, which is also critical for cache efficiency. In this part, I will discuss four works. First, I will investigate the design of a low-overhead learned cache. Existing caches that use machine learning often incur significant storage and computation overheads. I will show a new approach – learning on the group level, which amortizes overheads and accumulates more information for better learning. While GL-Cache is faster than existing learned caches, it is still more complex compared to state-of-the-art heuristics. In the following chapter, I will discuss two foundational techniques, lazy promotion and quick demotion, which enable us to design simple yet effective eviction algorithms. In the third chapter of this part, I will demonstrate an example using the two techniques, S3-FIFO, a new eviction algorithm only composed of FIFO queues but achieves higher efficiency and scalability than 11 state-of-the-art algorithms. In the last chapter of this part, I will introduce SIEVE, a new eviction algorithm that uses a single queue to achieve lazy promotion and quick demotion. SIEVE is simpler than LRU, but achieves state-of-the-art efficiency and scalability. Throughout this thesis, I will demonstrate how we leverage large-scale measurements to obtain insights for new system and algorithm designs, which allow modern software caches to achieve high efficiency and close-to-linear scalability. 298 pages
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department
| |
Return to:
SCS Technical Report Collection This page maintained by [email protected] |