InfoQ Architecture·May 22, 2026

Uber Eats Real-Time Restaurant Recommendation System

Uber Eats improved its restaurant recommendation system by incorporating real-time user signals and a listwise ranking approach. The new architecture replaces batch feature processing with a real-time signal processing layer, so recommendations can adapt to user preferences within a single browsing session. The design also aligns offline training with online serving, so that the model behaves consistently in production and personalization reflects recent activity with less delay.

Read original on InfoQ Architecture

Evolution to Real-Time Recommendation Architecture

Uber Eats has upgraded its recommendation engine, moving from a traditional batch-oriented feature pipeline to a near-real-time signal processing layer. The shift lets the system adapt to immediate user intent and changing preferences within a single session. Rather than waiting for historical data to be processed in batches, the system continuously ingests user interactions such as clicks, searches, and orders, which keeps its representation of user behavior up to date.
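To make the contrast with batch processing concrete, here is a minimal sketch of continuous ingestion. It is not Uber's implementation: the `Event` and `SessionStore` names, the in-memory store, the per-user cap, and the 30-minute window are all assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List


@dataclass
class Event:
    user_id: str
    kind: str   # "click", "search", or "order"
    item: str   # restaurant id or search query (hypothetical field)
    ts: float   # event time in seconds


@dataclass
class SessionStore:
    """In-memory stand-in for a real-time signal store (an assumption)."""
    max_events: int = 50
    _events: Dict[str, Deque[Event]] = field(default_factory=dict)

    def ingest(self, event: Event) -> None:
        # Append each event as it arrives; the deque drops the oldest
        # event once a user has more than max_events buffered.
        buf = self._events.setdefault(event.user_id, deque(maxlen=self.max_events))
        buf.append(event)

    def recent(self, user_id: str, now: float, window_s: float = 1800.0) -> List[Event]:
        # Return buffered events no older than window_s seconds, oldest first.
        return [e for e in self._events.get(user_id, ()) if now - e.ts <= window_s]
```

Each interaction becomes visible to `recent` as soon as `ingest` returns, whereas a batch pipeline would expose it only after the next scheduled job.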

Listwise Ranking for Improved Efficiency and Quality

A core enhancement to the ranking component is the adoption of listwise ranking. In pointwise ranking, each candidate restaurant is scored individually. Listwise ranking instead evaluates multiple restaurant candidates together in a single inference step. Because the model sees the candidates as a set, it can optimize their relative ordering directly, which improves both computational efficiency and ranking quality.
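One way to see what "optimizing the relative ordering" means is a listwise loss computed over the whole candidate set. The sketch below uses a ListNet-style top-one softmax cross-entropy; the source does not say which loss Uber uses, so this choice and the function names are assumptions.

```python
import math
from typing import List


def softmax(xs: List[float]) -> List[float]:
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores: List[float], relevance: List[float]) -> float:
    """Cross-entropy between the relevance and score distributions over one list."""
    target = softmax(relevance)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


def rank(candidates: List[str], scores: List[float]) -> List[str]:
    # Order candidates by descending score.
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return [candidates[i] for i in order]
```

The loss depends on every score in the list at once, so raising one candidate's score changes the penalty for all the others. A pointwise loss, by contrast, would be a sum of independent per-candidate terms.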

Key Architectural Components:

  • Real-time Signal Processing Layer: Ingests user interactions (clicks, searches, orders) continuously.
  • Unified User Behavior Representation: Combines short-term session activity with long-term historical signals.
  • Shared Feature Extraction Layer: Ensures consistency between offline training and online serving environments.
  • Generative Recommender-style Model: Leverages transformer-based sequence modeling for home feed recommendations.
  • Separated Inference and Preprocessing: Improves scalability by isolating the serving layer's focus on ranking from feature computation and aggregation.
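The unified user behavior representation listed above can be illustrated with a small blend of long-term and in-session signals. This is a sketch under stated assumptions: the cuisine-affinity dictionaries, the exponential decay, the 10-minute half-life, and the 50/50 weighting are hypothetical and do not come from Uber.

```python
from typing import Dict, List, Tuple


def unified_affinity(
    long_term: Dict[str, float],
    session_events: List[Tuple[str, float]],  # (cuisine, event time in seconds)
    now: float,
    half_life_s: float = 600.0,
    session_weight: float = 0.5,
) -> Dict[str, float]:
    """Blend long-term cuisine affinity with recency-weighted session activity."""
    if not session_events:
        # No session activity yet: fall back to long-term signals only.
        return dict(long_term)

    short_term: Dict[str, float] = {}
    for cuisine, ts in session_events:
        # An event half_life_s seconds old counts half as much as a fresh one.
        weight = 0.5 ** ((now - ts) / half_life_s)
        short_term[cuisine] = short_term.get(cuisine, 0.0) + weight

    # Normalize session weights so they sum to 1 before blending.
    total = sum(short_term.values())
    short_term = {k: v / total for k, v in short_term.items()}

    keys = set(long_term) | set(short_term)
    return {
        k: (1 - session_weight) * long_term.get(k, 0.0)
        + session_weight * short_term.get(k, 0.0)
        for k in keys
    }
```

With this blend, a user whose history favors pizza but who has just browsed sushi gets a representation that reflects both, which is the behavior the combined short-term and long-term design aims for.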

Training-Serving Skew Mitigation

A critical design consideration highlighted by Uber is the alignment between training and serving pipelines. Uber applies the same feature-extraction logic in both environments and trains on historical session replays that simulate production conditions. This reduces feature drift between training and serving, so models behave more consistently in live production.
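The idea can be sketched as a single feature function called from both paths, with training rows built by replaying a logged session one step at a time. The function names, the dict-based event format, and the two features shown are hypothetical, not Uber's actual code.

```python
from typing import Dict, List


def extract_features(session: List[dict], candidate: str) -> Dict[str, float]:
    """The one feature function shared by training and serving code."""
    clicks = [e for e in session if e["kind"] == "click"]
    return {
        "session_len": float(len(session)),
        "clicked_candidate": float(any(e["item"] == candidate for e in clicks)),
    }


def replay_training_rows(logged_session: List[dict], candidate: str) -> List[Dict[str, float]]:
    """Rebuild the features serving would have seen before each logged event."""
    rows = []
    for i in range(len(logged_session)):
        # Use only events that happened before step i, so no future data leaks in.
        rows.append(extract_features(logged_session[:i], candidate))
    return rows


def serve_features(live_session: List[dict], candidate: str) -> Dict[str, float]:
    # Online path: same function, applied to the events observed so far.
    return extract_features(live_session, candidate)
```

Because both paths call `extract_features`, a change to feature logic cannot reach one environment without reaching the other, and the prefix-by-prefix replay keeps each training row limited to what was known at that point in the session.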

The infrastructure is engineered to meet low-latency demands typical for consumer-facing recommendation surfaces. This involves a clear separation of concerns, with feature preprocessing handled upstream and the serving layer dedicated to high-speed ranking decisions. This modularity contributes to improved scalability and efficiency under high traffic loads.

recommendation system, real-time processing, machine learning, listwise ranking, data pipelines, system architecture, ubereats, personalization
