Pinterest Engineering · April 27, 2026

Architecting Pinterest's Shopping Conversion Candidate Generation Model

This article details the architectural evolution of Pinterest's shopping conversion candidate generation model: how the team addressed data sparsity and noise in offsite conversion events through training data design, feature engineering, and a two-tower architecture that pairs parallel DCN v2 cross layers with a unified multi-task learning setup, optimizing for both engagement and conversion.


Pinterest's engineering team tackled the complex problem of optimizing shopping ads for offsite conversions, which are sparse, noisy, and delayed compared to onsite engagement signals. The team moved from an engagement-focused system to a dedicated conversion candidate generation model, and this post highlights the key architectural and machine learning design decisions that improved advertiser value and user experience along the way.

Training Data Design for Sparse Conversions

To counteract the sparsity and noise of offsite conversion data, several strategic design choices were made:

  • Multi-Surface Model: A single model was trained across all shopping surfaces (Homefeed, Related Pins, Search) to consolidate sparse conversion labels, while surface-specific features captured contextual differences.
  • Dual Positive Signals: Onsite engagement data (clicks, repins) augmented the primary conversion signal, improving generalization. A log-based re-weighting function applied to click duration mitigated noise and false positives.
  • Negative Sampling: "Harder negatives" drawn from ad impressions without engagement were used alongside in-batch negatives, exposing the model to a more representative slice of the inventory and fostering robust contrastive learning (both ideas are sketched in code after this list).
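
Here is a minimal PyTorch sketch of how the dwell-time re-weighting and the mixed negative sampling might compose in a training loop. The log-based weighting curve, the 300-second normalization cap, the temperature, and all function names are illustrative assumptions for this sketch, not Pinterest's published implementation:

```python
import torch
import torch.nn.functional as F

def click_duration_weight(duration_sec: torch.Tensor) -> torch.Tensor:
    # Illustrative log-based re-weighting: near-zero weight for very short
    # (likely accidental) clicks, growing sublinearly with dwell time.
    # The 300-second normalization cap is an assumption for this sketch.
    return torch.log1p(duration_sec.clamp(min=0.0)) / torch.log1p(torch.tensor(300.0))

def retrieval_loss(user_emb, pos_emb, hard_neg_emb, label_weights, temperature=0.07):
    """Contrastive loss mixing in-batch negatives with 'harder' negatives
    (ads that were impressed but never engaged with).
    Shapes: user_emb, pos_emb [B, D]; hard_neg_emb [B, K, D]; label_weights [B]."""
    user = F.normalize(user_emb, dim=-1)
    pos = F.normalize(pos_emb, dim=-1)
    hard = F.normalize(hard_neg_emb, dim=-1)
    in_batch = user @ pos.T                               # [B, B]: other rows' positives serve as negatives
    hard_logits = torch.einsum("bd,bkd->bk", user, hard)  # [B, K]: explicit hard negatives
    logits = torch.cat([in_batch, hard_logits], dim=1) / temperature
    targets = torch.arange(user.size(0), device=user.device)  # diagonal = true positive
    per_example = F.cross_entropy(logits, targets, reduction="none")
    # Conversion positives would keep weight 1.0; engagement positives are
    # down-weighted by the dwell-time curve above.
    return (label_weights * per_example).mean()
```

Mixing the two negative types exposes the model both to the broad batch distribution and to the trickier impressed-but-ignored inventory.
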
💡 Key Learning: Handling Data Sparsity and Noise

Balancing high-value, sparse signals (conversions) with abundant, noisier signals (engagement) is a common challenge in real-world ML systems. Techniques like multi-task learning with weighted losses or auxiliary tasks are crucial for stable training and avoiding signal dilution.
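
As a sketch of the weighted-loss idea under the same assumptions as above, the combined objective can be as simple as a weighted sum, with the sparse conversion task up-weighted so the dense engagement signal does not drown it out. The weights shown are placeholders, not published values:

```python
import torch

def multi_task_loss(conversion_loss: torch.Tensor,
                    engagement_loss: torch.Tensor,
                    w_conversion: float = 5.0,   # placeholder weight
                    w_engagement: float = 1.0) -> torch.Tensor:
    # Up-weighting the sparse, high-value task keeps the abundant but
    # noisier engagement signal from dominating the shared embeddings.
    return w_conversion * conversion_loss + w_engagement * engagement_loss
```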

Model Architecture Innovations

The core of the system is a two-tower retrieval model, encoding user and Pin features separately using DCN v2 for cross-feature interactions. Two significant architectural evolutions improved performance:

  • Parallel DCN v2 and MLP Cross Layers: In this design, the DCN v2 cross network and a parallel MLP learn directly and simultaneously from the same input features. This removes the information bottleneck of sequential designs, combining richer explicit feature interactions (DCN v2) with implicit abstract pattern learning (MLP) without signal loss (see the tower sketch after this list).
  • Unified Multi-Task Architecture: Moving from a multi-head structure with separate engagement and conversion heads to a unified single-head multi-task architecture let the final embeddings benefit directly from multi-task optimization at serving time. An advertiser-level loss function was introduced to tame the high variance of Pin-level conversion data, providing more stable supervision and significantly boosting conversion recall (a grouping sketch follows the tower code).
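
To make the parallel design concrete, here is a minimal PyTorch sketch of one tower; the same shape would serve both the user and Pin sides. The DCN v2 cross network and the MLP both read the same raw feature vector, and their outputs are fused into one embedding. Layer counts, widths, and class names are illustrative assumptions, not Pinterest's actual configuration:

```python
import torch
import torch.nn as nn

class CrossLayerV2(nn.Module):
    """One DCN v2 cross layer: x_{l+1} = x_0 * (W @ x_l + b) + x_l."""
    def __init__(self, dim: int):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x0: torch.Tensor, xl: torch.Tensor) -> torch.Tensor:
        return x0 * self.linear(xl) + xl

class ParallelTower(nn.Module):
    """A tower in which the DCN v2 cross network and an MLP both consume the
    same input features in parallel (no sequential bottleneck), with their
    outputs concatenated and projected into the shared embedding space."""
    def __init__(self, input_dim: int, num_cross_layers: int = 3,
                 mlp_dims=(512, 256), emb_dim: int = 64):
        super().__init__()
        self.cross = nn.ModuleList(
            [CrossLayerV2(input_dim) for _ in range(num_cross_layers)])
        layers, prev = [], input_dim
        for d in mlp_dims:
            layers += [nn.Linear(prev, d), nn.ReLU()]
            prev = d
        self.mlp = nn.Sequential(*layers)
        self.proj = nn.Linear(input_dim + prev, emb_dim)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        x = features
        for layer in self.cross:
            x = layer(features, x)      # explicit feature crosses
        deep = self.mlp(features)       # implicit abstract patterns
        return self.proj(torch.cat([x, deep], dim=-1))
```

And a sketch of how advertiser-level supervision could be computed: per-Pin losses are averaged within each advertiser before averaging across advertisers, so a few high-variance Pins from one advertiser cannot dominate the gradient. This grouping helper is an assumed implementation, not taken from the post:

```python
import torch

def advertiser_level_loss(per_pin_loss: torch.Tensor,
                          advertiser_ids: torch.Tensor) -> torch.Tensor:
    # Group per-Pin losses by advertiser, average within each group, then
    # average across groups: a stabler objective than the raw per-Pin mean.
    uniq, inverse = torch.unique(advertiser_ids, return_inverse=True)
    sums = torch.zeros(uniq.size(0), dtype=per_pin_loss.dtype,
                       device=per_pin_loss.device)
    counts = torch.zeros_like(sums)
    sums.index_add_(0, inverse, per_pin_loss)
    counts.index_add_(0, inverse, torch.ones_like(per_pin_loss))
    return (sums / counts).mean()
```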

Impact and Key Takeaways

The architectural and modeling advancements led to significant improvements, including a 2.3% increase in shopping conversion volume, a 2.7% lift in impression to conversion rate, and a 3.1% improvement in Return on Ad Spend (RoAS). This demonstrates the power of iterative architectural refinement and sophisticated machine learning techniques to solve complex business problems at scale.

Tags: machine learning, recommendation system, ads platform, candidate generation, deep learning, DCN, multi-task learning, Pinterest
