Seamless Istio Upgrades at Scale
2025-08-07
11 min read
by Rushy R. Panchal
Endigest AI Core Summary
This post explains how Airbnb safely upgrades Istio across tens of thousands of pods on dozens of Kubernetes clusters without requiring coordination from workload teams.
- Airbnb follows Istio's canary upgrade model, running two Istiod revisions simultaneously so that workloads on different versions can still communicate.
- A rollouts.yml file controls the version distribution per namespace using consistent hashing, enabling gradual, deterministic rollouts and easy rollbacks.
- Krispr, an in-house mutation framework, injects the correct Istio revision label at CI time and again at pod admission, decoupling upgrades from workload deployments.
- A maximum node and pod lifetime of two weeks guarantees that all Kubernetes workloads are upgraded within four weeks, even without redeployment.
- VM workloads are handled separately via a pull-based agent that reads rollouts.yml and applies the correct Istio version on each VM.
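
The hash-based version assignment described above can be illustrated with a minimal Python sketch. It is not Airbnb's implementation: the summary does not show the rollouts.yml schema or the hashing scheme, so the weight mapping, the revision names `1-24` and `1-25`, and the helpers `bucket_for`, `revision_for`, and `revision_label` are all hypothetical. It also uses simple stable hash bucketing as a stand-in for consistent hashing. The only real identifier is `istio.io/rev`, the standard Istio revision label.

```python
import hashlib


def bucket_for(namespace: str) -> int:
    """Map a namespace to a stable bucket in [0, 100).

    A cryptographic digest is used instead of the built-in hash(),
    which is randomized per process and so is not deterministic.
    """
    digest = hashlib.sha256(namespace.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def revision_for(namespace: str, rollout: dict[str, int]) -> str:
    """Pick the revision whose cumulative weight range contains the bucket.

    `rollout` is a hypothetical stand-in for one rollouts.yml entry:
    revision name -> percentage of namespaces, newest revision first.
    Listing the newest revision first means that raising its weight
    only ever moves namespaces forward, never back.
    """
    if sum(rollout.values()) != 100:
        raise ValueError("rollout weights must sum to 100")
    bucket = bucket_for(namespace)
    cumulative = 0
    for revision, weight in rollout.items():
        cumulative += weight
        if bucket < cumulative:
            return revision
    raise AssertionError("unreachable when weights sum to 100")


def revision_label(namespace: str, rollout: dict[str, int]) -> dict[str, str]:
    """Build the label a mutation step could set on a pod (assumed shape)."""
    return {"istio.io/rev": revision_for(namespace, rollout)}


if __name__ == "__main__":
    # Hypothetical rollout: 30% of namespaces on the new revision.
    rollout = {"1-25": 30, "1-24": 70}
    for ns in ("payments", "search", "messaging"):
        print(ns, revision_label(ns, rollout))
```

Because the bucket depends only on the namespace name, the same namespace resolves to the same revision everywhere it is evaluated (at CI time, at pod admission, or on a VM agent), and a rollback in this sketch amounts to lowering the new revision's weight.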
Tags:
#open-source
#kubernetes
#engineering
#infrastructure
#istio
