Pinterest Engineering·December 8, 2025

Pinterest's AI-Assisted System for Real-Time Content Violation Prevalence Measurement

Pinterest engineered an AI-assisted system to measure the prevalence of policy-violating content in real time. The system uses multimodal LLMs to label content at scale and weighted reservoir sampling, followed by re-weighting, to keep the estimates statistically unbiased. This addresses the limitations of relying solely on user reports. The architecture enables proactive risk detection, data-driven policy adjustments, and efficient resource allocation for Trust & Safety.

The Challenge of Measuring Content Violation Prevalence

Historically, Trust & Safety teams relied heavily on user reports to identify policy-violating content. This approach has significant blind spots: some harms are under-reported (e.g., self-harm, due to stigma), malicious actors do not report content, rare categories lack statistical power, and human review at scale is costly and slow. Pinterest needed a system to measure "prevalence", defined as the percentage of all views on a given day that went to violative content, to provide a more accurate, real-time understanding of platform safety.

Core Architectural Components

Inputs: Engagement data (impressions, clicks) at the entity × day level, along with the latest production risk scores from enforcement models. Missing scores are imputed.
Sampling: A weighted reservoir sampler selects images from the daily user impressions stream. The sampler uses risk scores and impression counts to improve efficiency, and re-weighting at estimation time keeps the final estimates unbiased. It supports both probability-proportional-to-size sampling with replacement (PPSWR) and pure random sampling for validation.
Labeling: A multimodal LLM (vision + text) bulk-labels the sampled content. Prompts are reviewed by policy subject matter experts (SMEs). The system logs decisions, rationales, and full lineage for auditability, and is designed to be model-agnostic.
Estimation: Computes overall prevalence and various pivots (by policy area, surface, sub-policy), persisting estimates, weights, and labels to production stores. Diagnostics and lineage are also recorded.
Dashboard & Alerting: Provides daily prevalence with 95% confidence intervals (CI), sample positive rates, auxiliary score distributions, and run health/lineage. Allows slicing by various dimensions.
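
The Sampling step above can be illustrated with a minimal Python sketch of PPSWR over a stream. This is not Pinterest's implementation: the weight formula (impressions times a floored risk score), the floor value, and every name below are assumptions made for illustration. The technique shown is a set of independent size-1 weighted reservoirs, where each slot ends up holding entity i with probability w_i / W, so the draws are proportional to size and made with replacement.

```python
import random
from dataclasses import dataclass


@dataclass
class Draw:
    entity_id: str
    weight: float  # sampling weight w_i of the drawn entity


def sampling_weight(impressions: int, risk_score: float, floor: float = 0.05) -> float:
    """Hypothetical weight: exposure times a floored risk score.

    The floor keeps every entity with impressions selectable (p_i > 0),
    which inverse-probability re-weighting requires.
    """
    return impressions * max(risk_score, floor)


def pps_with_replacement(stream, n_draws: int, rng: random.Random):
    """Single pass over (entity_id, impressions, risk_score) rows.

    Each of the n_draws slots is an independent size-1 weighted reservoir:
    the current row replaces the slot's occupant with probability w / W,
    where W is the running weight total including the current row.
    Returns the filled slots and the final total weight.
    """
    slots = [None] * n_draws
    total = 0.0
    for entity_id, impressions, risk_score in stream:
        w = sampling_weight(impressions, risk_score)
        if w <= 0:
            continue  # entities with no impressions contribute no views
        total += w
        p_replace = w / total
        for k in range(n_draws):
            if rng.random() < p_replace:
                slots[k] = Draw(entity_id, w)
    return slots, total


# Usage with made-up rows: (entity_id, impressions, risk_score)
rows = [("pin_a", 100, 0.9), ("pin_b", 100, 0.0), ("pin_c", 0, 0.9)]
draws, total_weight = pps_with_replacement(rows, n_draws=5, rng=random.Random(7))
for d in draws:
    # d.weight / total_weight is the per-draw selection probability p_i
    print(d.entity_id, d.weight / total_weight)
```

The inner loop costs one random number per slot per row, which is fine for a sketch but would likely need a cheaper scheme at the scale of a daily impressions stream.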

Key Design Principle: Decoupling Measurement from Enforcement

The system strategically uses production risk scores during sampling to focus labeling budget on high-risk, high-exposure content. Crucially, the estimator then re-weights these samples using inverse-probability weighting (Hansen–Hurwitz or Horvitz–Thompson ratios) to remove the 'lensing' introduced by the risk scores. This ensures the prevalence statistic accurately reflects impressions and is unbiased, even if enforcement model thresholds or calibrations drift. This decoupling is vital for maintaining measurement integrity and comparability over time.
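
To make the re-weighting concrete, here is a hedged sketch of a Hansen–Hurwitz style prevalence estimate for draws made with replacement. It assumes the total impression count for the day is known from engagement data and that each draw carries its sampling weight; the function name, the tuple layout, and the normal-approximation interval are illustrative choices, not details confirmed by the source.

```python
import math
from statistics import mean, stdev


def hansen_hurwitz_prevalence(draws, total_weight, total_impressions, z=1.96):
    """Estimate prevalence from PPSWR draws.

    draws: list of (impressions, sampling_weight, is_violating), one per draw.
    The per-draw selection probability is p_i = sampling_weight / total_weight.
    Each draw yields an unbiased single-draw estimate
    (violating impressions / p_i) / total_impressions; the estimator is their mean.
    Needs at least two draws so a standard error can be computed.
    """
    ratios = []
    for impressions, weight, is_violating in draws:
        p = weight / total_weight
        violating_impressions = impressions if is_violating else 0
        ratios.append(violating_impressions / p / total_impressions)
    estimate = mean(ratios)
    stderr = stdev(ratios) / math.sqrt(len(ratios))
    # Normal-approximation interval, clipped to the valid range for a proportion
    low = max(0.0, estimate - z * stderr)
    high = min(1.0, estimate + z * stderr)
    return estimate, (low, high)


# Toy population: A has 100 violating impressions, B has 900 benign ones,
# so the true prevalence is 0.1. Risk-based weights make both equally likely.
toy_draws = [(100, 90.0, True), (900, 90.0, False)]
estimate, ci = hansen_hurwitz_prevalence(toy_draws, total_weight=180.0, total_impressions=1000)
print(estimate, ci)
```

Because each draw is divided by its own selection probability, oversampling high-risk content changes the variance of the estimate but not its expectation, which is why the statistic stays comparable when enforcement scores drift.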

Addressing Challenges and Trade-offs

The team addressed several challenges. Rare categories produce wide CIs, which are handled by adapting sampling parameters, stratifying, or pooling to weekly data. Policy and prompt drift is managed through versioning and backfills. The stability of LLM decision quality is tracked through continuous monitoring, human validation of subsamples, and SME-labeled gold sets. Cost is controlled by tracking token usage and exploring multi-step LLM labeling. These considerations highlight the trade-offs between precision, speed, and resource utilization in a large-scale ML-driven system.
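
One way to monitor LLM decision quality against an SME-labeled gold set is to track agreement metrics per run. The sketch below is an assumption about what such a check could look like; the source does not say which metrics Pinterest uses. It computes precision, recall, and Cohen's kappa for binary labels, treating the SME labels as ground truth.

```python
def agreement_metrics(llm_labels, sme_labels):
    """Compare binary LLM labels with SME gold labels (True = violating)."""
    if len(llm_labels) != len(sme_labels) or not llm_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(llm_labels)
    pairs = list(zip(llm_labels, sme_labels))
    tp = sum(1 for l, s in pairs if l and s)
    fp = sum(1 for l, s in pairs if l and not s)
    fn = sum(1 for l, s in pairs if not l and s)
    tn = n - tp - fp - fn

    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")

    # Cohen's kappa: observed agreement corrected for chance agreement
    observed = (tp + tn) / n
    expected = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (observed - expected) / (1 - expected) if expected < 1 else float("nan")
    return {"precision": precision, "recall": recall, "kappa": kappa}


# Made-up labels for eight gold-set items
llm = [True, True, False, False, True, False, False, False]
sme = [True, False, False, False, True, True, False, False]
print(agreement_metrics(llm, sme))
```

Tracking these values per prompt version and model version would make a quality regression visible before it distorts the prevalence trend.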

Impact and Future Directions

The AI-assisted prevalence system gives Pinterest proactive risk detection, a 15x faster labeling turnaround, and significantly lower operational costs. This enables quicker root cause analysis, data-driven policy iteration, strategic decision-making (benchmarking, goal setting, resource allocation), and precise A/B testing of enforcement strategies. Future work includes expanding pivoting capabilities, further cost optimization (e.g., fine-tuning LLMs, multi-step labeling), and human-in-the-loop denoising and debiasing to refine LLM accuracy.
