Machine Learning

Scaling Recommendation Systems with Request-Level Deduplication

2026-04-13

10 min read

by Pinterest Engineering

Tags: machine-learning, infrastructure, engineering, recommendation-system


Endigest AI Core Summary

Pinterest describes request-level deduplication, a technique for keeping infrastructure costs manageable while scaling its recommendation models to 100x more parameters.

• Request-level deduplication eliminates redundant processing of user data that would otherwise be repeated for every item scored in the recommendation pipeline
• Apache Iceberg tables sorted by user and request achieve 10-50x storage compression on user-heavy feature columns
• SyncBatchNorm, which aggregates batch statistics across devices, fixes a 1-2% regression caused by request-sorted data violating the IID assumption
• User-level masking in the loss function addresses false negatives (up to 30%) that arise when multiple engagements from the same user are grouped into one batch
• The techniques apply across the full ML lifecycle: storage optimization, training correctness and speedups, and serving throughput gains
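
The first and fourth points above can be illustrated with a minimal Python sketch. It is not Pinterest's implementation: `user_tower`, `score_deduplicated`, and `masked_in_batch_softmax_loss` are hypothetical names, the "user tower" is a stand-in for any expensive user-side computation, and the loss assumes a softmax over in-batch negatives, which the summary does not spell out.

```python
import math


def user_tower(user_features):
    # Hypothetical stand-in for an expensive user-side model computation.
    user_tower.calls += 1
    return [2.0 * x for x in user_features]


user_tower.calls = 0  # call counter, used to show how much work is saved


def score(user_emb, item_emb):
    # Dot-product score between a user embedding and an item embedding.
    return sum(u * i for u, i in zip(user_emb, item_emb))


def score_deduplicated(rows):
    """Score (request_id, user_features, item_emb) rows.

    The user-side computation runs once per request_id rather than once
    per candidate item, then is reused for every item in that request.
    """
    user_cache = {}
    scores = []
    for request_id, user_features, item_emb in rows:
        if request_id not in user_cache:
            user_cache[request_id] = user_tower(user_features)
        scores.append(score(user_cache[request_id], item_emb))
    return scores


def masked_in_batch_softmax_loss(logits, user_ids):
    """Softmax cross-entropy over in-batch negatives with user-level masking.

    logits[i][j] is the score of row i's user against row j's positive item.
    Column j is dropped from row i's softmax when j != i but both rows belong
    to the same user, since that item may be a false negative.
    """
    n = len(logits)
    total = 0.0
    for i in range(n):
        kept = [
            logits[i][j]
            for j in range(n)
            if j == i or user_ids[j] != user_ids[i]
        ]
        m = max(kept)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in kept))
        total += log_z - logits[i][i]
    return total / n


if __name__ == "__main__":
    rows = [
        ("r1", [1.0, 2.0], [1.0, 0.0]),
        ("r1", [1.0, 2.0], [0.0, 1.0]),
        ("r2", [3.0, 1.0], [1.0, 1.0]),
    ]
    print(score_deduplicated(rows), "user_tower calls:", user_tower.calls)
```

With request-sorted data, rows from the same user tend to land in the same batch, which is what makes the cache in `score_deduplicated` effective and also what makes the masking in the loss necessary.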
