The Airbnb Tech Blog - Medium | DevOps

From Static Rate Limiting to Adaptive Traffic Management in Airbnb’s Key-Value Store

2025-10-09
11 min read
by Shravan Gaonkar

Endigest AI Core Summary

Airbnb evolved Mussel, its multi-tenant key-value store, from simple QPS rate limiting to an adaptive traffic management system to maximize goodput during traffic spikes.

  • Resource-aware rate control (RARC) charges each request in 'request units' (RU) based on rows processed, bytes, and latency rather than raw request counts.
  • Load shedding uses a latency ratio (long-term p95 / short-term p95) to detect system stress: when the ratio falls below 1, recent latency has risen above its baseline, and the system automatically applies backpressure to lower-priority traffic classes.
  • A CoDel-inspired thread pool monitors queue wait times and drops requests early when the dispatcher is saturated, freeing resources for high-priority traffic.
  • Hot-key detection identifies skewed access patterns in real time and uses caching or request coalescing to shield the storage backend from both legitimate surges and DDoS attacks.
  • Criticality tiers ensure high-priority traffic (e.g., customer support, trust and safety) remains responsive even when capacity is exhausted.
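
To make the request-unit idea concrete, here is a minimal sketch of charging requests by cost rather than by count. The cost formula, its weights, and the names `RequestStats`, `request_units`, and `RUBucket` are assumptions made for illustration; the summary does not give Mussel's actual formula or quota mechanism. A token bucket is used here only as one plausible way to enforce an RU budget.

```python
from dataclasses import dataclass


@dataclass
class RequestStats:
    rows: int             # rows processed by the request
    payload_bytes: int    # bytes read or written
    latency_ms: float     # observed server-side latency


def request_units(stats, base=1.0, row_weight=0.1, kb_weight=0.5, ms_weight=0.05):
    """Hypothetical RU formula: a base charge plus weighted rows, KiB, and latency."""
    return (base
            + row_weight * stats.rows
            + kb_weight * stats.payload_bytes / 1024
            + ms_weight * stats.latency_ms)


class RUBucket:
    """Token bucket whose tokens are request units instead of request counts."""

    def __init__(self, refill_per_sec, capacity):
        self.refill_per_sec = refill_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = 0.0

    def try_charge(self, cost, now):
        """Refill for the elapsed time, then deduct `cost` if the budget allows."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


point_read = RequestStats(rows=1, payload_bytes=1024, latency_ms=2.0)
big_scan = RequestStats(rows=1000, payload_bytes=1024 * 1024, latency_ms=100.0)
```

Under these illustrative weights, the scan costs several hundred times more than the point read, so a tenant issuing a few scans exhausts its budget as quickly as one issuing many cheap reads. A plain QPS limit cannot express that difference.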
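
The latency-ratio signal and the criticality tiers can be sketched together. The ratio definition (long-term p95 divided by short-term p95) comes from the summary above; the tier names, the shed thresholds, and the nearest-rank percentile are assumptions chosen for illustration, not values from the article.

```python
def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def latency_ratio(long_window, short_window):
    """long-term p95 / short-term p95; values well below 1 signal rising latency."""
    short = p95(short_window)
    if short == 0:
        return 1.0  # no measurable recent latency: treat as healthy
    return p95(long_window) / short


# Hypothetical tiers: shed a class once the ratio drops below its threshold.
# A threshold of 0.0 means the class is never shed by this check.
SHED_BELOW = {"critical": 0.0, "standard": 0.3, "best_effort": 0.6}


def admit(tier, ratio):
    """Return True if a request of this tier should be admitted at this ratio."""
    return ratio >= SHED_BELOW[tier]


baseline = [10.0] * 20                   # long-term window, milliseconds
spike = [10.0] * 10 + [50.0] * 10        # short-term window during a slowdown
```

With these sample windows, the ratio is 1.0 when the short window matches the baseline and 0.2 during the spike. At 0.2 only the `critical` tier is still admitted, which is the behaviour the criticality-tier bullet describes.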
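
The queue-wait idea behind the CoDel-inspired thread pool can be illustrated with a simplified queue. This is a sketch under stated assumptions, not the article's implementation: the class name `CoDelQueue`, the 5 ms target, and the 100 ms interval are illustrative, and the full CoDel control law (which spaces drops at shrinking intervals) is deliberately left out. The queue drops a request only when waits have exceeded the target for a full interval.

```python
from collections import deque


class CoDelQueue:
    """Simplified CoDel-style queue: shed work once waits stay above target."""

    def __init__(self, target_s=0.005, interval_s=0.100):
        self.target_s = target_s        # acceptable queue wait
        self.interval_s = interval_s    # how long waits may exceed target
        self.items = deque()            # (enqueue_time, request)
        self.above_since = None         # when waits first exceeded target

    def put(self, request, now):
        self.items.append((now, request))

    def get(self, now):
        """Return (next_request_or_None, dropped_requests)."""
        dropped = []
        while self.items:
            enqueued_at, request = self.items.popleft()
            wait = now - enqueued_at
            if wait <= self.target_s:
                self.above_since = None          # queue has recovered
                return request, dropped
            if self.above_since is None:
                self.above_since = now           # start the overload clock
            if now - self.above_since >= self.interval_s:
                dropped.append(request)          # overloaded too long: shed
                continue
            return request, dropped              # slow, but still within grace
        return None, dropped
```

Dropping at dequeue time means a stale request never occupies a worker thread, so the capacity it would have used goes to requests that can still meet their deadline.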
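
Hot-key shielding through caching can be sketched as follows. The detection method here is a plain per-window counter with a fixed threshold, which is an assumption for illustration; the summary does not say how Mussel detects hot keys. The names `HotKeyDetector` and `ShieldedReader` and the `backend_get` callable are likewise hypothetical. Request coalescing, the other technique the summary mentions, is not shown.

```python
from collections import Counter


class HotKeyDetector:
    """Counts accesses per key in fixed windows and flags keys over a threshold."""

    def __init__(self, window_s=1.0, threshold=100):
        self.window_s = window_s
        self.threshold = threshold
        self.window_start = 0.0
        self.counts = Counter()

    def record(self, key, now):
        """Count one access; return True if the key is hot in the current window."""
        if now - self.window_start >= self.window_s:
            self.counts.clear()
            self.window_start = now
        self.counts[key] += 1
        return self.counts[key] >= self.threshold


class ShieldedReader:
    """Serves hot keys from a short-lived local cache instead of the backend."""

    def __init__(self, backend_get, detector, ttl_s=1.0):
        self.backend_get = backend_get
        self.detector = detector
        self.ttl_s = ttl_s
        self.cache = {}  # key -> (expires_at, value)

    def get(self, key, now):
        hot = self.detector.record(key, now)
        cached = self.cache.get(key)
        if cached is not None and cached[0] > now:
            return cached[1]
        value = self.backend_get(key)
        if hot:
            self.cache[key] = (now + self.ttl_s, value)
        return value
```

Only keys that cross the threshold are cached, so the cache stays small and cold keys always read fresh data from the backend.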
Tags:
#engineering
#cloud-storage
#key-value-store
#cloud-services
#infrastructure