Pinterest Engineering·May 21, 2026

Pinterest's User-Sequence Platform for ML Recommendations

Pinterest redesigned its user-sequence platform to efficiently generate, enrich, and serve user sequence data for machine learning models. This system addresses critical challenges in freshness, completeness, consistency, and cost-efficiency for data used in ranking, retrieval, and recommendation systems. The core innovation lies in a "one definition, many runtimes" approach, leveraging a shared execution engine and a Lambda architecture for both real-time and batch processing.

Read original on Pinterest Engineering

The article describes Pinterest's overhaul of its user-sequence data platform, which feeds machine learning models across the product. User sequences are ordered lists of recent, relevant user events, enriched with additional signals such as embeddings and contextual features. They are essential inputs for models that capture temporal behavior, such as Transformers, which are used in personalized recommendations, search, and ads.
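
As a rough illustration of the data shape described above, a user sequence can be modeled as a bounded, time-ordered list of enriched events. This is a minimal sketch only: the field names (`timestamp_ms`, `action`, `item_id`, `item_embedding`, `context`) and the `max_len` cap are assumptions made for the example, not Pinterest's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class EnrichedEvent:
    """One user event plus enrichment signals (hypothetical fields)."""
    timestamp_ms: int
    action: str
    item_id: str
    item_embedding: List[float] = field(default_factory=list)
    context: Dict[str, str] = field(default_factory=dict)


@dataclass
class UserSequence:
    """Time-ordered events for one user, capped at max_len entries."""
    user_id: str
    max_len: int = 100
    events: List[EnrichedEvent] = field(default_factory=list)

    def append(self, event: EnrichedEvent) -> None:
        # Keep events sorted by time so late arrivals land in the right slot.
        self.events.append(event)
        self.events.sort(key=lambda e: e.timestamp_ms)
        # Retain only the most recent max_len events.
        if len(self.events) > self.max_len:
            self.events = self.events[-self.max_len:]
```

Sorting on every append keeps the sketch short; a production system would more likely insert in order or sort once at read time.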

Key Challenges in User-Sequence Data Management

Building a robust user-sequence platform at scale presents several challenges, particularly in a multi-tenant environment supporting numerous teams and models:

  • Freshness: How quickly new events and enrichments are reflected in sequences for real-time inference.
  • Completeness: Ensuring late-arriving events, corrections, and backfills are eventually incorporated.
  • Consistent Enrichment: Maintaining uniform enrichment logic and data alignment between streaming and batch processes, preventing train-serve skew.
  • Stable Schemas: Providing predictable and versioned schemas for downstream consumers.
  • Cost-Efficiency: Managing the storage and processing costs associated with large volumes of sequence data.
  • Operability and Debugging: Making the complex multi-step process easier to monitor and troubleshoot.

Core Architectural Principles and Solutions

One Definition, Many Runtimes

This principle ensures a single, consistent definition for event filtering, enrichment, and sequence assembly. This definition is then applied across different runtimes: real-time indexing, batch indexing/backfill, and online serving. This prevents the common problem of data drift between training and serving systems.
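
One way to picture this principle is a single definition object that both a batch runner and a streaming runner call into. The sketch below is hypothetical: `SequenceDefinition`, `run_batch`, `run_streaming`, and the `ENGAGEMENT` definition are names invented for illustration, and the filter and enrichment rules are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Event = Dict[str, object]


@dataclass(frozen=True)
class SequenceDefinition:
    """Single source of truth for filtering and enrichment (hypothetical)."""
    name: str
    keep: Callable[[Event], bool]
    enrich: Callable[[Event], Event]

    def apply(self, event: Event) -> Optional[Event]:
        # Return the enriched event, or None if the filter rejects it.
        if not self.keep(event):
            return None
        return self.enrich(event)


def run_batch(defn: SequenceDefinition, events: Iterable[Event]) -> List[Event]:
    """Batch runtime: process a finite collection in one pass."""
    results = (defn.apply(e) for e in events)
    return [r for r in results if r is not None]


def run_streaming(defn: SequenceDefinition, stream: Iterator[Event]) -> Iterator[Event]:
    """Streaming runtime: emit each enriched event as it arrives."""
    for event in stream:
        result = defn.apply(event)
        if result is not None:
            yield result


# Placeholder definition: keep clicks and saves, flag saves.
ENGAGEMENT = SequenceDefinition(
    name="engagement_v1",
    keep=lambda e: e["action"] in {"click", "save"},
    enrich=lambda e: {**e, "is_save": e["action"] == "save"},
)
```

Because both runners delegate to `apply`, the same input yields the same enriched records in either runtime, which is the property that guards against train-serve skew.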

The platform employs a Lambda architecture to reconcile the conflicting demands of data freshness and completeness. This involves distinct paths for real-time updates and batch processing, with a clear merge policy for eventual consistency. Key architectural decisions include:
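
The summary does not spell out the merge policy, so the following is only one plausible version, stated as an assumption: the batch view is treated as authoritative up to a watermark, real-time events are accepted only after that watermark, and duplicates are resolved by event id in favor of batch. The function name `merge_views` and the `id`/`ts` keys are hypothetical.

```python
from typing import Dict, List

Event = Dict[str, object]


def merge_views(
    batch: List[Event],
    realtime: List[Event],
    batch_watermark_ms: int,
) -> List[Event]:
    """Merge batch and real-time views of one user's sequence.

    Assumed policy (not from the source): batch is complete and correct up
    to batch_watermark_ms; real-time fills in anything newer.
    """
    merged: Dict[str, Event] = {}
    # Take only real-time events the batch view has not covered yet.
    for event in realtime:
        if event["ts"] > batch_watermark_ms:
            merged[event["id"]] = event
    # Batch records overwrite any real-time record with the same id.
    for event in batch:
        merged[event["id"]] = event
    # Return a single time-ordered sequence; id breaks timestamp ties.
    return sorted(merged.values(), key=lambda e: (e["ts"], e["id"]))
```

Under this policy a real-time event older than the watermark that is absent from the batch view is dropped, on the assumption that the batch path has already applied late arrivals and corrections for that range.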

  • Configuration-as-Code: Sequence and enrichment definitions are managed as code (Python), enabling faster onboarding, improved reviewability, and clear separation of concerns.
  • Shared Execution Engine: A central engine processes raw events into enriched records based on configuration. It handles data sources, filtering, featurization, and writing results. Pluggable executors within this engine encapsulate business-specific logic, minimizing code duplication between streaming and batch jobs.
  • Columnar, Time-Partitioned Storage: Sequence data is stored in a columnar format to allow models to read only necessary fields, optimizing storage and read performance. Time partitioning facilitates efficient writes and targeted scans.
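
To make the first two decisions more concrete, here is a minimal sketch of a configuration-driven engine with pluggable executors. Everything in it is an assumption for illustration: the `register` decorator, the executor names `drop_views` and `add_surface`, and the shape of the config dictionary are invented, and the real platform's interfaces are not described in the summary.

```python
from typing import Callable, Dict, Iterable, List, Optional

Event = Dict[str, object]
Executor = Callable[[Event], Optional[Event]]

# Registry of pluggable executors, keyed by the name used in configs.
EXECUTORS: Dict[str, Executor] = {}


def register(name: str) -> Callable[[Executor], Executor]:
    """Decorator that makes an executor addressable from configuration."""
    def wrap(fn: Executor) -> Executor:
        EXECUTORS[name] = fn
        return fn
    return wrap


@register("drop_views")
def drop_views(event: Event) -> Optional[Event]:
    # Filtering executor: returning None removes the event.
    return None if event["action"] == "view" else event


@register("add_surface")
def add_surface(event: Event) -> Optional[Event]:
    # Featurization executor: fill in a default surface when missing.
    return {**event, "surface": event.get("surface", "unknown")}


def run_engine(config: Dict[str, object], source: Iterable[Event], sink: List[Event]) -> None:
    """Apply the configured executors, in order, to every source event."""
    steps = [EXECUTORS[name] for name in config["executors"]]
    for event in source:
        current: Optional[Event] = event
        for step in steps:
            current = step(current)
            if current is None:
                break  # Event was filtered out; skip remaining steps.
        if current is not None:
            sink.append(current)
```

A streaming job and a batch job could both call `run_engine` with the same config and differ only in what they pass as `source` and `sink`, which is how the duplication between the two paths would be kept small.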

System Components

  • Ingestion: Supports both streaming (Kafka) for real-time events and batch (data warehouses) for historical data.
  • Enrichment and Execution Layer: The shared engine applying configured filters, joins, and transforms to raw events.
  • Real-time Indexer: A streaming job for low-latency updates to a time-versioned online store.
  • Batch Indexer and Backfill Pipeline: Scheduled jobs for processing historical data and generating intermediate datasets.
  • Columnar, Time-Partitioned Storage: Where enriched sequence data resides.
  • Online Serving API: Exposes a clean API for fetching user sequences, performing request-time enrichments, and applying trimming logic for online inference.
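
The storage and serving components can be sketched together as an in-memory toy: values are stored per column inside hourly partitions, and a fetch function reads only the requested columns over a time window, then trims to the most recent events. The class `ColumnarSequenceStore`, the function `fetch_sequence`, the hourly bucket size, and the assumption that every row carries the same columns are all choices made for this example. Request-time enrichment is omitted.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

HOUR_MS = 3_600_000  # Assumed partition width: one hour.


class ColumnarSequenceStore:
    """Toy column-oriented store partitioned by (user, hour)."""

    def __init__(self) -> None:
        # (user_id, hour_bucket) -> column name -> list of values
        self._partitions = defaultdict(lambda: defaultdict(list))

    def write(self, user_id: str, ts_ms: int, row: Dict[str, object]) -> None:
        # Assumes every row has the same columns, so lists stay aligned.
        part = self._partitions[(user_id, ts_ms // HOUR_MS)]
        part["ts_ms"].append(ts_ms)
        for col, val in row.items():
            part[col].append(val)

    def read(self, user_id: str, start_ms: int, end_ms: int,
             columns: Sequence[str]) -> Dict[str, List[object]]:
        out: Dict[str, List[object]] = {c: [] for c in ("ts_ms", *columns)}
        # Scan only the partitions that overlap the requested window.
        for bucket in range(start_ms // HOUR_MS, end_ms // HOUR_MS + 1):
            part = self._partitions.get((user_id, bucket))
            if part is None:
                continue
            for i, ts in enumerate(part["ts_ms"]):
                if start_ms <= ts <= end_ms:
                    # Copy only the requested columns for this row.
                    for c in out:
                        out[c].append(part[c][i])
        return out


def fetch_sequence(store: ColumnarSequenceStore, user_id: str, now_ms: int,
                   lookback_ms: int, columns: Sequence[str],
                   max_len: int) -> Dict[str, List[object]]:
    """Serving-side read: project columns, order by time, trim to max_len."""
    cols = store.read(user_id, now_ms - lookback_ms, now_ms, columns)
    order = sorted(range(len(cols["ts_ms"])), key=lambda i: cols["ts_ms"][i])
    order = order[-max_len:] if max_len > 0 else []
    return {c: [vals[i] for i in order] for c, vals in cols.items()}
```

A model that needs only `item_id` never touches the other columns, and a short lookback window touches only a few partitions. Those are the two savings the columnar, time-partitioned layout is meant to provide.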

Together, the shared definitions, the Lambda-style split between real-time and batch indexing, and columnar storage let Pinterest serve fresh, complete, and consistent user-sequence data at lower cost, which its ML-driven recommendation and ranking systems depend on.
