CMU-CS-15-122
Computer Science Department
School of Computer Science, Carnegie Mellon University




JBösen: A Java Bounded Asynchronous Key-Value Store for Big ML

Yihua Fang

July 2015

M.S. Thesis

CMU-CS-15-122.pdf


Keywords: Distributed System, Machine Learning, Parameter Server, Stale Synchronous Parallel Consistency Model, Big Data, Big Model, Big ML, Data-Parallelism

To use distributed systems effectively in Machine Learning (ML) applications, practitioners need considerable expertise in the area. Although highly abstracted distributed system frameworks such as Hadoop reduce the complexity of writing code for distributed systems, their performance does not match that of specialized implementations. Other efforts, such as Spark and GraphLab, each have their own downsides. In light of this observation, the Petuum Project is intended to provide a new framework for implementing highly efficient distributed ML applications through a high-level programming interface.

JBösen is a Java key-value store in Petuum that follows the Parameter Server (PS) paradigm. It aims to extend the Petuum project to Java, since almost a quarter of programmers use Java. It provides an iterative-convergent programming model that covers a wide range of ML algorithms, together with an easy-to-use programming interface. Unlike other platforms, JBösen exploits the error tolerance of ML algorithms to improve performance: its Stale Synchronous Parallel (SSP) consistency model relaxes the overall consistency of the system by allowing workers to read older, stale values.
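
The staleness idea behind SSP can be illustrated with a minimal Java sketch. It assumes the usual reading of "bounded" staleness: a worker may keep computing only while its iteration clock is at most a fixed number of steps ahead of the slowest worker. The class `SspClockSketch` and its methods are hypothetical names chosen for illustration and are not JBösen's actual API; the sketch is also single-threaded and leaves out the key-value table itself.

```java
import java.util.Arrays;

// Hypothetical illustration of a bounded-staleness check; not JBösen's API.
final class SspClockSketch {
    private final int[] clocks;   // one iteration counter per worker
    private final int staleness;  // largest allowed gap to the slowest worker

    SspClockSketch(int numWorkers, int staleness) {
        if (numWorkers <= 0 || staleness < 0) {
            throw new IllegalArgumentException("need at least one worker and staleness >= 0");
        }
        this.clocks = new int[numWorkers];
        this.staleness = staleness;
    }

    // Clock of the slowest worker.
    int minClock() {
        return Arrays.stream(clocks).min().getAsInt();
    }

    // A worker may proceed while it is no more than `staleness`
    // iterations ahead of the slowest worker.
    boolean canProceed(int worker) {
        return clocks[worker] - minClock() <= staleness;
    }

    // Called when a worker finishes one iteration.
    void tick(int worker) {
        clocks[worker]++;
    }
}
```

With a staleness of 0 the check reduces to fully synchronous execution, where every worker waits for the slowest one after each iteration. Larger values let fast workers run ahead on possibly stale values, at the cost of weaker consistency.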

43 pages

Thesis Committee:
Eric P. Xing (Chair)
Kayvon Fatahalian

Frank Pfenning, Head, Computer Science Department
Andrew W. Moore, Dean, School of Computer Science


