Endigest AI Core Summary
Pinterest introduced a GPU-served two-tower model that uses an MMOE-DCN architecture (Multi-gate Mixture-of-Experts with Deep & Cross Network layers) for lightweight ads engagement prediction.
•The two-tower design separates Pin (ad) embeddings, refreshed through offline batch updates, from user embeddings, computed through real-time inference; each user-Pin pair is scored by applying a sigmoid to the dot product of the two embeddings
•Architecture shifted from Multi-Task Multi-Domain (MTMD) to MMOE with MLP gating and full-rank/low-rank DCN layers per expert
•GPU serving enabled the more complex model while maintaining latency comparable to the CPU baseline
•Training optimizations included GPU prefetch, fused kernels, BF16 precision, larger batch sizes, and tuned worker threads
•Achieved a 5–10% offline loss reduction for CTR prediction; training standard and shopping ads separately doubled iteration speed and reduced loss by a further 5–10%
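The two-tower scoring path described above can be illustrated with a minimal sketch. It is not Pinterest's implementation: the `PIN_EMBEDDINGS` table, the `user_tower` stand-in, and all values are hypothetical, and the real towers are learned networks rather than the identity mapping used here. The sketch only shows the shape of the computation: Pin embeddings are looked up from a precomputed store, the user embedding is computed at request time, and the score is the sigmoid of their dot product.

```python
import math


def dot(u, v):
    """Dot product of two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embedding dimensions must match")
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


# Hypothetical offline store: Pin (ad) embeddings written by a batch job.
PIN_EMBEDDINGS = {
    "pin_a": [0.5, -0.25, 1.0],
    "pin_b": [-1.0, 0.5, 0.0],
}


def user_tower(features):
    """Stand-in for real-time user-tower inference (assumption: identity)."""
    return list(features)


def score_ads(user_features, pin_ids):
    """Score each candidate Pin as sigmoid(user_embedding . pin_embedding)."""
    user_emb = user_tower(user_features)  # computed once per request
    return {
        pin_id: sigmoid(dot(user_emb, PIN_EMBEDDINGS[pin_id]))
        for pin_id in pin_ids
    }


scores = score_ads([1.0, 2.0, 0.5], ["pin_a", "pin_b"])
```

Because the Pin side is only a lookup, the per-request cost is one user-tower forward pass plus one dot product per candidate, which is what keeps this kind of model lightweight.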
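The MMOE and DCN pieces mentioned above can also be sketched in a few lines. This is an illustration under stated assumptions, not the article's architecture: it uses the cross-layer form from the DCN-V2 formulation, x_next = x0 * (W x + b) + x, with a low-rank variant that replaces W by U V^T; it uses a single linear gate where the summary describes an MLP gate; and it shows one gate only, whereas MMoE uses one gate per task. All function names and weights are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a row-major matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Softmax with max subtraction for numerical stability."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_full(x0, xl, w, b):
    """Full-rank cross layer: x0 * (W xl + b) + xl, elementwise product."""
    wx = matvec(w, xl)
    return [x0_i * (wx_i + b_i) + xl_i
            for x0_i, wx_i, b_i, xl_i in zip(x0, wx, b, xl)]


def cross_low_rank(x0, xl, u, v, b):
    """Low-rank cross layer: W is approximated by U V^T (u, v are d x r)."""
    v_t = [list(col) for col in zip(*v)]  # r x d
    projected = matvec(v_t, xl)           # project down to rank r
    wx = matvec(u, projected)             # project back up to dimension d
    return [x0_i * (wx_i + b_i) + xl_i
            for x0_i, wx_i, b_i, xl_i in zip(x0, wx, b, xl)]


def mmoe(x, experts, gate_w):
    """Mix expert outputs with softmax gate weights (single linear gate)."""
    weights = softmax(matvec(gate_w, x))
    outputs = [expert(x) for expert in experts]
    mixed = [sum(g * out[i] for g, out in zip(weights, outputs))
             for i in range(len(x))]
    return mixed, weights


x = [1.0, 2.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zero_bias = [0.0, 0.0]
u = [[1.0], [0.0]]  # rank-1 factors, so U V^T = [[1, 0], [0, 0]]
v = [[1.0], [0.0]]

experts = [
    lambda inp: cross_full(inp, inp, identity, zero_bias),
    lambda inp: cross_low_rank(inp, inp, u, v, zero_bias),
]
gate_w = [[0.0, 0.0], [0.0, 0.0]]  # zero logits give a uniform gate

mixed, gate_weights = mmoe(x, experts, gate_w)
```

The low-rank variant trades a d x d weight matrix for two d x r factors, which reduces parameters and multiply-adds when r is much smaller than d.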
This summary was automatically generated by AI based on the original article and may not be fully accurate.