Pinterest redesigned its user-sequence platform to efficiently generate, enrich, and serve user sequence data for machine learning models. This system addresses critical challenges in freshness, completeness, consistency, and cost-efficiency for data used in ranking, retrieval, and recommendation systems. The core innovation lies in a "one definition, many runtimes" approach, leveraging a shared execution engine and a Lambda architecture for both real-time and batch processing.
The original article, published on Pinterest Engineering, details Pinterest's architectural overhaul of its user-sequence data platform, which powers many of the company's machine learning models. User sequences are ordered lists of recent, relevant user events, enriched with additional signals such as embeddings and contextual features. These sequences are vital for models that capture temporal behavior, such as Transformers, used in personalized recommendations, search, and ads.
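To make the idea of a user sequence concrete, the following is a minimal sketch of one as an ordered, length-capped list of enriched events. The class name, field names, and the `build_sequence` helper are all hypothetical, chosen for illustration; the original article does not specify a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class EnrichedEvent:
    # All field names are hypothetical, not taken from Pinterest's schema.
    timestamp_ms: int                  # when the user action happened
    action: str                        # e.g. "click" or "save"
    item_id: str                       # the item the user acted on
    embedding: List[float] = field(default_factory=list)   # enrichment signal
    context: Dict[str, str] = field(default_factory=dict)  # contextual features


def build_sequence(events: List[EnrichedEvent], max_len: int) -> List[EnrichedEvent]:
    """Return the most recent `max_len` events, ordered oldest first."""
    if max_len <= 0:
        return []
    ordered = sorted(events, key=lambda e: e.timestamp_ms)
    return ordered[-max_len:]
```

A sequence model would typically consume the result in order, reading each event's embedding and context as input features.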
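Building a robust user-sequence platform at scale presents several challenges, particularly in a multi-tenant environment supporting numerous teams and models: keeping sequence data fresh, complete, and consistent between training and serving, while remaining cost-efficient.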
One Definition, Many Runtimes
This principle ensures a single, consistent definition for event filtering, enrichment, and sequence assembly. This definition is then applied across different runtimes: real-time indexing, batch indexing/backfill, and online serving. This prevents the common problem of data drift between training and serving systems.
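One way to picture this principle is a single definition object whose filtering, enrichment, and assembly logic is reused by every runtime. The sketch below is an illustration under assumptions, not Pinterest's implementation: `SequenceDefinition`, `admit`, `cap`, `batch_build`, and `realtime_update` are hypothetical names, and events are plain dictionaries with an assumed `ts` timestamp key.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, List, Optional

Event = Dict[str, Any]


@dataclass(frozen=True)
class SequenceDefinition:
    """Hypothetical single source of truth for one sequence."""
    keep: Callable[[Event], bool]     # event filter
    enrich: Callable[[Event], Event]  # adds extra signals to an event
    max_len: int                      # sequence length cap, assumed >= 1


def admit(defn: SequenceDefinition, event: Event) -> Optional[Event]:
    """Shared step: filter, then enrich. Returns None if filtered out."""
    return defn.enrich(event) if defn.keep(event) else None


def cap(defn: SequenceDefinition, enriched: Iterable[Event]) -> List[Event]:
    """Shared step: order by time and keep the most recent max_len events."""
    ordered = sorted(enriched, key=lambda e: e["ts"])
    return ordered[-defn.max_len:]


def batch_build(defn: SequenceDefinition, events: Iterable[Event]) -> List[Event]:
    """Batch runtime: rebuild a sequence from the full event history."""
    admitted = [admit(defn, e) for e in events]
    return cap(defn, [e for e in admitted if e is not None])


def realtime_update(defn: SequenceDefinition, current: List[Event],
                    new_event: Event) -> List[Event]:
    """Real-time runtime: fold one new event into an existing sequence."""
    admitted = admit(defn, new_event)
    if admitted is None:
        return current
    return cap(defn, current + [admitted])
```

Because both runtimes call the same `admit` and `cap` functions, folding events in one at a time yields the same sequence as a batch rebuild over the same events, which is the property that guards against drift.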
The platform employs a Lambda architecture to reconcile the conflicting demands of data freshness and completeness. It runs distinct paths for real-time updates and batch processing, with a clear merge policy that brings the two to eventual consistency.
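The article summary does not spell out the merge policy, so the following is only a sketch of one common Lambda-style choice: treat the batch output as authoritative up to a watermark timestamp, and let real-time events fill in everything after it. The function name, the `id` and `ts` keys, and the watermark parameter are all assumptions made for illustration.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def merge_sequences(batch: List[Event], realtime: List[Event],
                    batch_watermark_ts: int, max_len: int) -> List[Event]:
    """Merge batch and real-time views of one user's sequence.

    Assumed policy (not from the source): batch events are trusted up to
    and including the watermark; real-time events are used only after it.
    Events are deduplicated by their "id" key. max_len is assumed >= 1.
    """
    merged: Dict[str, Event] = {
        e["id"]: e for e in batch if e["ts"] <= batch_watermark_ts
    }
    for e in realtime:
        if e["ts"] > batch_watermark_ts:
            merged.setdefault(e["id"], e)  # keep the first copy seen
    ordered = sorted(merged.values(), key=lambda e: (e["ts"], e["id"]))
    return ordered[-max_len:]
```

Under this policy, a later batch run that advances the watermark replaces the real-time events it covers, so the merged view converges on the batch result over time.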
This holistic approach enables Pinterest to deliver high-quality, cost-efficient, and consistent user sequence data, critical for the performance and evolution of their ML-driven recommendation and ranking systems.