Performance Characterization of a Taxonomy of Threading Models for Mid-tier Servers

October 2017

Poster

Abstract

Modern On-Line Data Intensive (OLDI) applications have evolved from monolithic systems to instead comprise numerous, distributed microservices interacting via Remote Procedure Calls (RPCs). Microservices face sub-millisecond (sub-ms) RPC latency goals, much tighter than their monolithic counterparts that must meet ≥ 100 ms latency targets. Sub-ms–scale threading and concurrency design effects that were once insignificant for such monolithic services can now come to dominate in the sub-ms–scale microservice regime. We investigate how threading design critically impacts microservice tail latency by developing a taxonomy of threading models—a structured understanding of the implications of how microservices manage concurrency and interact with RPC interfaces under wide-ranging loads. We develop μTune, a system that has two features: (1) a novel framework that abstracts threading model implementation from application code, and (2) an automatic load adaptation system that curtails microservice tail latency by exploiting inherent latency trade-offs revealed in our taxonomy to transition among threading models. We study μTune in the context of four OLDI applications to demonstrate up to 1.9x tail latency improvement over static threading choices and state-of-the-art adaptation techniques.

Type

Preprint

Publication

Accepted as a poster in the Career Workshop for Women and Minorities in Computer Architecture held in association with the International Symposium on Microarchitecture (CWWMCA ‘17)

Akshitha Sriraman

Assistant Professor

I am an Assistant Professor in the Department of Electrical and Computer Engineering at Carnegie Mellon University. My research bridges computer architecture and software systems, with a focus on making hyperscale data center systems more efficient (via solutions that span the systems stack). The central theme of my work is to (1) design software that is aware of new hardware constraints and (2) architect hardware that efficiently supports new hyperscale software requirements.