Endigest AI Core Summary
Slack's Deploy Safety Program reduced customer impact hours by 90% over 18 months by overhauling deployment practices and safety culture.
•73% of customer-facing incidents were triggered by Slack-induced change, primarily code deploys, which motivated the program's North Star goals.
•Goals targeted automated detection and remediation within 10 minutes, manual remediation within 20 minutes, and limiting problematic deploys to under 10% fleet exposure.
•The team invested broadly at first, then doubled down on successful patterns—starting with Webapp backend metric monitoring, then expanding to automatic rollbacks across frontend and infra.
•Automatic rollbacks were the key turning point: manual remediation alone was insufficient, and automation drove dramatic improvement in results.
•A 3–6 month lag in trailing incident metrics required patience and executive alignment every 4–6 weeks to maintain program confidence.
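The North Star goals above can be read as three numeric thresholds. The following is a minimal sketch of checking one problematic deploy against them. It is not Slack's implementation: the `BadDeploy` type, its field names, the `goal_violations` helper, and the choice to measure remediation time as a single number of minutes are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Thresholds taken from the goals in the summary; all names are hypothetical.
AUTO_REMEDIATION_LIMIT_MIN = 10    # automated detection and remediation
MANUAL_REMEDIATION_LIMIT_MIN = 20  # manual remediation
MAX_FLEET_EXPOSURE = 0.10          # "under 10%" read as a strict upper bound


@dataclass
class BadDeploy:
    minutes_to_remediate: float  # assumed: time from start of impact to remediation
    automated: bool              # True if remediation was automatic (e.g. a rollback)
    fleet_exposure: float        # fraction of the fleet that received the change, 0.0 to 1.0


def goal_violations(deploy: BadDeploy) -> list[str]:
    """Return the names of the goals this deploy missed (empty if none)."""
    limit = AUTO_REMEDIATION_LIMIT_MIN if deploy.automated else MANUAL_REMEDIATION_LIMIT_MIN
    violations = []
    if deploy.minutes_to_remediate > limit:
        violations.append("remediation_time")
    if deploy.fleet_exposure >= MAX_FLEET_EXPOSURE:
        violations.append("fleet_exposure")
    return violations


if __name__ == "__main__":
    # An automatic rollback after 8 minutes at 5% exposure meets every goal.
    print(goal_violations(BadDeploy(8, automated=True, fleet_exposure=0.05)))
    # A manual fix after 25 minutes at 25% exposure misses both.
    print(goal_violations(BadDeploy(25, automated=False, fleet_exposure=0.25)))
```

Under this reading, a 15-minute remediation passes if it was manual but fails if it was automated, which is one way to see why the automated path carries the tighter target.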
This summary was automatically generated by AI based on the original article and may not be fully accurate.