CMU-CS-23-128
Computer Science Department
School of Computer Science, Carnegie Mellon University



CMU-CS-23-128

Optimal Scheduling in Multiserver Queues

Isaac Grosof

Ph.D. Thesis

July 2023

CMU-CS-23-128.pdf


Keywords: Queueing theory; scheduling; optimal scheduling; response time; multiserver; tails; M/G/1; M/G/k; dispatching; Shortest Remaining Processing Time; SRPT; First-Come First-Served; FCFS; Nudge; Guardrails; work; Multiserver-job; ServerFilling; ServerFilling-SRPT; DivisorFilling; WCFS; Finite Skip; RESET; MARC; Gittins index; heavy-traffic; stochastic improvement; tail probability

Scheduling theory is a key tool for reducing latency (i.e. response time) in queueing systems. Scheduling, i.e. choosing the order in which to serve jobs, can reduce response time by an order of magnitude with no additional resources. Scheduling theory is well-developed in single-server systems, where one job is processed at a time. However, little is known about scheduling in multiserver systems, where many jobs are processed at once. Results are especially limited in stochastic multiserver scheduling theory. Today's datacenters have thousands of servers, and scheduling theory is unable to analyze such systems.

This thesis proves the first optimality results and first closed-form bounds on mean response time for scheduling policies in stochastic multiserver models which reflect the behavior of modern computing systems. The thesis explores three themes:

  1. I start by studying one-server-per-job multiserver models, and prove the first results on optimal scheduling in that setting. Optimality results are proven for both a central-queue model and a dispatching model. I invent a novel class of dispatching policies, guardrails, to achieve these results.
  2. Next, I study the multiserver-job (MSJ) model, where different jobs require different amounts of resources to be served. I prove the first characterization of mean response time for any scheduling policy in the MSJ model, as well as the first optimality results. I invent novel scheduling policies, ServerFilling and ServerFilling-SRPT, to achieve these results.
  3. Finally, I study the effects of scheduling on the tail of response time, rather than mean response time. The prior state-of-the-art for scheduling for the tail was First-Come First-Served, which was conjectured to achieve optimal asymptotic tail of response time. I invent a novel scheduling policy, Nudge, which I prove to be the first policy to outperform FCFS's asymptotic tail of response time.

325 pages

Thesis Committee:
Mor Harchol-Balter (Chair)
Alan Scheller-Wolf
Anupam Gupta
Weina Wang
Michael Mitzenmacher (Harvard University)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science


Return to: SCS Technical Report Collection
School of Computer Science

This page maintained by [email protected]