Endigest

© 2026 Endigest. All rights reserved.

infrastructure Articles

21 articles

Related Tags

engineering(18)
technology(7)
observability(6)
pinterest(5)
software-architecture(3)
site-reliability-engineer(3)
machine-learning(3)
distributed-systems(3)
Uncategorized(2)
open-source(2)
migration(2)
workflow(1)
software-engineering(1)
engineering-culture(1)
ai(1)
Airbnb
11 min read
Data Engineering•2026-05-19

Scaling Airbnb’s identity graph with a unified knowledge graph infrastructure

Airbnb migrated its identity graph from a third-party PaaS to an internally managed knowledge graph infrastructure built on JanusGraph and DynamoDB.

technology
graph-database
infrastructure
knowledge-graph
engineering
Airbnb
11 min read
DevOps•2026-05-05

Monitoring reliably at scale

This post explains how Airbnb eliminated circular dependencies in its observability stack to ensure reliable monitoring at scale.

engineering
infrastructure
technology
observability
site-reliability-engineer
Netflix
61 min read
Machine Learning•2026-05-01

State of Routing in Model Serving

Netflix's Switchboard processes 1 million requests per second and provides a centralized ML abstraction for clients.

ai-platform
distributed-systems
infrastructure
machine-learning
Pinterest
31 min read
Machine Learning•2026-05-01

Optimizing ML Workload Network Efficiency (Part I): Feature Trimmer

Pinterest improved the network efficiency of its ML serving by implementing Feature Trimmer to relieve a bandwidth bottleneck.

engineering
pinterest
machine-learning
infrastructure
efficiency
Airbnb
51 min read
Architecture•2026-04-28

Skipper: Building Airbnb’s embedded workflow engine

Skipper is Airbnb's embedded workflow engine designed to enable durable execution of multi-step business processes without requiring external orchestration infrastructure.

workflow
software-architecture
infrastructure
technology
engineering
Airbnb
31 min read
DevOps•2026-04-21

Building a fault-tolerant metrics storage system at Airbnb

Airbnb built a metrics storage system ingesting 50 million samples per second and storing 1.3 billion active time series.

site-reliability-engineer
infrastructure
technology
engineering
software-architecture
Pinterest
41 min read
Backend•2026-04-20

Smarter URL Normalization at Scale: How MIQPS Powers Content Deduplication at Pinterest

Pinterest's MIQPS algorithm automatically learns which URL parameters affect content identity, enabling efficient deduplication across millions of merchant URLs at scale.

pinner-experience
engineering
infrastructure
eng-culture
pinterest
Pinterest
61 min read
Machine Learning•2026-04-13

Scaling Recommendation Systems with Request-Level Deduplication

Pinterest shares its request-level deduplication technique for managing infrastructure costs when scaling recommendation systems to 100x more model parameters.

pinterest
machine-learning
infrastructure
engineering
recommendation-system
Airbnb
101 min read
DevOps•2026-04-07

Building a high-volume metrics pipeline with OpenTelemetry and vmagent

This post details a production migration of a large-scale metrics pipeline from StatsD to OpenTelemetry (OTLP) with Prometheus-based storage and vmagent for streaming aggregation.

engineering
technology
infrastructure
observability
site-reliability-engineer
Slack
31 min read
DevOps•2026-03-31

From Custom to Open: Scalable Network Probing and HTTP/3 Readiness with Prometheus

Slack addresses HTTP/3 observability challenges through QUIC support in the Prometheus Blackbox Exporter.

Uncategorized
golang
infrastructure
networking
observability
open-source
Pinterest
141 min read
AI•2026-03-19

Building an MCP Ecosystem at Pinterest

Pinterest describes how they built a production MCP (Model Context Protocol) ecosystem to enable AI agents to safely automate engineering tasks.

engineering-culture
pinterest
infrastructure
ai
engineering
Airbnb
111 min read
DevOps•2026-03-17

From vendors to vanguard: Airbnb’s hard-won lessons in observability ownership

Airbnb shares hard-won lessons from migrating its observability platform from third-party vendors to a custom in-house solution built on Prometheus across 1,000 services.

engineering
technology
observability
infrastructure
Airbnb
91 min read
DevOps•2026-03-04

It Wasn’t a Culture Problem: Upleveling Alert Development at Airbnb

Airbnb explains how they rebuilt their Observability as Code (OaC) alert development workflow to eliminate weeks-long validation cycles.

infrastructure
software-engineering
observability
technology
engineering
Pinterest
101 min read
Architecture•2026-02-24

Piqama: Pinterest Quota Management Ecosystem

Pinterest's Piqama is a generic quota management ecosystem that handles the full lifecycle of resource quotas across Big Data Processing and Online Services.

data-governance
distributed-systems
engineering
pinterest
infrastructure
Airbnb
491 min read
DevOps•2026-02-18

Safeguarding Dynamic Configuration Changes at Scale

This post describes how Airbnb built "Sitar," their internal dynamic configuration platform for shipping runtime config changes safely at scale.

engineering
software-development
infrastructure
distributed-systems
software-architecture
Airbnb
111 min read
DevOps•2025-10-09

From Static Rate Limiting to Adaptive Traffic Management in Airbnb’s Key-Value Store

Airbnb evolved Mussel, its multi-tenant key-value store, from simple QPS rate limiting to an adaptive traffic management system to maximize goodput during traffic spikes.

engineering
cloud-storage
key-value-store
cloud-services
infrastructure
Slack
81 min read
DevOps•2025-10-07

Deploy Safety: Reducing customer impact from change

Slack's Deploy Safety Program reduced customer impact hours by 90% over 18 months by overhauling deployment practices and safety culture.

Uncategorized
automation
ci-cd
deployment
engineering
incident-response
infrastructure
observability
Airbnb
71 min read
Architecture•2025-09-24

Building a Next-Generation Key-Value Store at Airbnb

Airbnb shares how they completely rearchitected Mussel, their internal key-value store for derived data, migrating from v1 to a NewSQL-based v2 running in production for over a year.

engineering
migration
infrastructure
storage
sql
Airbnb
111 min read
Architecture•2025-09-16

Taming Service-Oriented Architecture Using A Data-Oriented Service Mesh

Airbnb introduces Viaduct, a data-oriented service mesh built on GraphQL that addresses the complexity of large-scale microservices dependency graphs.

java
microservices
infrastructure
graphql
Airbnb
91 min read
DevOps•2025-08-13

Migrating Airbnb’s JVM Monorepo to Bazel

Airbnb completed a 4.5-year migration of their JVM monorepo (tens of millions of lines of Java, Kotlin, and Scala) from Gradle to Bazel.

monorepo
migration
engineering
infrastructure
bazel