When protections outlive their purpose: A lesson on managing defense systems at scale
2026-01-15
7 min read
Endigest AI Core Summary
GitHub shares a post-incident analysis on how emergency rate-limiting protections outlived their purpose and began incorrectly blocking legitimate users.
- Protections added during past abuse incidents relied on composite signals (industry-standard fingerprinting techniques combined with platform-specific business logic), and those signals later produced false positives.
- Only 0.5–0.9% of fingerprint-matched requests were blocked: those that also matched the business-logic rules. Requests matching both criteria were blocked 100% of the time. This affected ~0.003–0.004% of total traffic.
- The multi-layered infrastructure (built on HAProxy) made the root cause hard to trace: it required correlating logs across the edge, application, and protection-rule layers.
- Without lifecycle management (expiration dates, post-incident reviews), temporary mitigations silently became permanent technical debt.
- GitHub now treats incident mitigations as temporary by default, is improving cross-layer observability, and is formalizing post-incident rule reviews.
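
The "temporary by default" idea in the summary can be sketched in a few lines. This is an illustrative Python sketch only, not GitHub's implementation: the names `MitigationRule`, `should_block`, and `rules_due_for_review`, the rule and incident identifiers, and the 30-day default lifetime are all assumptions made for the example. It shows a rule that blocks only when both signals match and that stops blocking once its expiration date passes, at which point it surfaces for review.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class MitigationRule:
    """A protection rule that expires unless someone deliberately renews it."""
    name: str                 # hypothetical rule identifier
    incident_id: str          # hypothetical link back to the originating incident
    created_at: datetime
    ttl: timedelta = timedelta(days=30)  # assumed default lifetime, not from the source

    @property
    def expires_at(self) -> datetime:
        return self.created_at + self.ttl

    def is_active(self, now: datetime) -> bool:
        return now < self.expires_at


def should_block(rule: MitigationRule, fingerprint_match: bool,
                 business_logic_match: bool, now: datetime) -> bool:
    """Block only while the rule is active AND both signals match."""
    return rule.is_active(now) and fingerprint_match and business_logic_match


def rules_due_for_review(rules: list[MitigationRule], now: datetime) -> list[str]:
    """Names of expired rules that should be renewed or deleted after review."""
    return [r.name for r in rules if not r.is_active(now)]


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
rule = MitigationRule("block-abusive-fingerprint", "INC-EXAMPLE", created)

during_incident = created + timedelta(days=1)
long_after = created + timedelta(days=31)

print(should_block(rule, True, True, during_incident))   # both signals match
print(should_block(rule, True, False, during_incident))  # fingerprint only
print(should_block(rule, True, True, long_after))        # rule has expired
print(rules_due_for_review([rule], long_after))
```

The design choice here is that expiry fails open: an expired rule stops blocking rather than continuing silently, so keeping a mitigation in place requires an explicit decision during review.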
