LLM-Powered Relevance Assessment for Pinterest Search

2025-12-10

10 min read

by Pinterest Engineering

Tags:

machine-learning

experimentation

engineering

Endigest AI Core Summary

Pinterest's Search team describes a method for scaling search relevance assessment by using fine-tuned LLMs in place of costly human annotation.

• A cross-encoder architecture fine-tunes open-source multilingual LLMs on a 5-level relevance scale using human-annotated data; XLM-RoBERTa-large was selected for its balance of quality and speed
• Each Pin is represented by textual features: titles, descriptions, BLIP image captions, board titles, and engaged query tokens
• A stratified query sampling design replaces simple random sampling, with strata defined by a query-to-interest model and popularity segments
• LLM labeling reduced the minimum detectable effect (MDE) from 1.3–1.5% to ≤0.25%, primarily through variance reduction via stratification
• XLM-RoBERTa-large labels 150,000 rows in 30 minutes on a single A10G GPU; LLM labels achieve 73.7% exact-match and 91.7% within-1-point agreement with human annotators
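
The link between stratification and a lower MDE can be illustrated with a small calculation. The sketch below is not Pinterest's implementation: it assumes a standard two-arm comparison of mean relevance with equal sample sizes, a two-sided test at alpha = 0.05 with 80% power, and proportional allocation across strata. The strata weights, means, and standard deviations are hypothetical values chosen only for illustration, and the resulting MDE is an absolute difference on the relevance scale rather than the relative percentages quoted in the summary.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical strata: (population weight, mean relevance, standard deviation).
# Illustrative numbers only; they do not come from Pinterest's data.
STRATA = [
    (0.5, 4.2, 0.6),  # e.g. popular queries
    (0.3, 3.5, 0.8),  # e.g. mid-popularity queries
    (0.2, 2.6, 1.0),  # e.g. rare queries
]


def within_variance(strata):
    """Weighted average of the per-stratum variances."""
    return sum(w * sd ** 2 for w, _, sd in strata)


def between_variance(strata):
    """Weighted variance of the stratum means around the overall mean."""
    overall_mean = sum(w * m for w, m, _ in strata)
    return sum(w * (m - overall_mean) ** 2 for w, m, _ in strata)


def mde(variance, n_per_arm, alpha=0.05, power=0.8):
    """Minimum detectable absolute difference between two equal-size arms."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return z * sqrt(2 * variance / n_per_arm)


n_per_arm = 10_000  # assumed sample size per experiment arm

# Simple random sampling pays for within- and between-strata variance.
var_srs = within_variance(STRATA) + between_variance(STRATA)
# Proportional stratified sampling pays only for within-strata variance.
var_stratified = within_variance(STRATA)

mde_srs = mde(var_srs, n_per_arm)
mde_stratified = mde(var_stratified, n_per_arm)

print(f"MDE, simple random sampling: {mde_srs:.4f}")
print(f"MDE, stratified sampling:    {mde_stratified:.4f}")
```

Under these assumptions the stratified design always yields an MDE no larger than simple random sampling, and the gap grows as the strata means diverge. Cheap LLM labels also make a larger `n_per_arm` affordable, which lowers the MDE further.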
