InfoQ Architecture·June 22, 2026

Building a Scalable and Cost-Effective User Tracking Service at Delivery Hero

This article details Delivery Hero's journey in deprecating Google Analytics and developing an internal, highly scalable, and cost-effective user tracking platform. It covers the architectural decisions, challenges faced regarding data quality and real-time processing, and the strategies employed for testing, rollout, and continuous optimization. The system achieved superior data quality and lower costs compared to its predecessor.

Read original on InfoQ Architecture

Motivation for an Internal User Tracking System

Delivery Hero decided to replace Google Analytics (GA) with an in-house solution because of several critical limitations and external factors. The primary drivers were the need for real-time data (GA delivered data only once or twice a day), support for an unlimited number of event types (GA limited how many could be defined), and GDPR concerns about storing sensitive user data with a third party. Cost was also a significant factor: the initial goal was simply not to exceed GA's costs, and the team ultimately went further, achieving a 25% cost reduction.

Core Architecture and Evolution

The user tracking system began with a simple, highly scalable architecture comprising a central API and two Pub/Sub message processors (one of them for fallback). This initial design handled significant load without issues. As requirements evolved, additional services were introduced around the core API to address reliability, data validation, and SDK support for mobile and frontend clients. Data is streamed to real-time consumers via Pub/Sub and stored in BigQuery for all other consumers.

Key Architectural Components:

  • SDKs (Mobile & Frontend): To collect user interaction data.
  • API Gateway: Ingests tracking data from SDKs.
  • Pub/Sub: Serves as a reliable message bus for streaming data.
  • Processors: Read messages from Pub/Sub for initial processing and fallback scenarios.
  • Data Validation Service: Ensures data quality and integrity.
  • BigQuery: Primary data storage for analytical and batch consumers.
  • Real-time Consumers: Directly consume from Pub/Sub for immediate insights.
  • Curator Jobs: Additional services for data enrichment and management.
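
The flow through these components can be sketched in a few lines. This is a minimal in-memory illustration, not Delivery Hero's implementation: the article does not describe the event schema or how the fallback path is triggered, so the required fields, the IngestApi class, and the choice of ConnectionError as the failure signal are all assumptions made for the example, and plain lists stand in for Pub/Sub topics.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical minimal schema; the real event contract is not given in the article.
REQUIRED_FIELDS = ("event_name", "user_id", "timestamp")


def is_valid(event: dict) -> bool:
    """Accept an event only if every required field is present and non-empty."""
    return all(event.get(name) not in (None, "") for name in REQUIRED_FIELDS)


@dataclass
class IngestApi:
    """Stand-in for the central API: validate, then publish with a fallback."""

    publish_primary: Callable[[dict], None]
    publish_fallback: Callable[[dict], None]
    rejected: list = field(default_factory=list)

    def ingest(self, event: dict) -> str:
        if not is_valid(event):
            self.rejected.append(event)
            return "rejected"
        try:
            self.publish_primary(event)
            return "primary"
        except ConnectionError:
            # Assumed failure mode: the primary path is unreachable.
            self.publish_fallback(event)
            return "fallback"


# In-memory lists stand in for the message bus topics.
primary_topic: list = []
fallback_topic: list = []
state = {"primary_up": True}


def publish_primary(event: dict) -> None:
    if not state["primary_up"]:
        raise ConnectionError("simulated primary outage")
    primary_topic.append(event)


api = IngestApi(publish_primary, fallback_topic.append)
event = {"event_name": "add_to_cart", "user_id": "u1", "timestamp": 1750000000}

first = api.ingest(event)               # primary path is healthy
state["primary_up"] = False
second = api.ingest(event)              # primary path fails, fallback takes over
third = api.ingest({"user_id": "u1"})   # missing fields, rejected
```

Keeping the API this thin (validate, publish, fall back) is what lets a core like this scale horizontally; enrichment and heavier checks can live in downstream services.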

Architectural Principle: Start Simple, Iterate and Expand

Delivery Hero's approach highlights the value of starting with a minimal, scalable core and incrementally adding complexity and specialized services as problems arise and requirements solidify. This iterative development avoids over-engineering upfront and allows the architecture to naturally evolve to meet growing demands.

Challenges and Solutions: Data Quality, Cost, and Scalability

The team focused on improving data quality, measured by the order match rate: 85% with GA versus more than 91% with the internal tool. The improvement came from fixing data loss in the SDKs and building a more reliable ingestion infrastructure. Cost optimization was measured as cost per message, which ended up 25% lower than with GA. Scalability was addressed through load testing with real data, including simulated peaks three times higher than typical traffic, which prepared the system to withstand unexpected surges without incidents or data loss. The simple architecture also lent itself to high scalability.
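
The two KPIs are simple ratios, and a short sketch makes the arithmetic concrete. The article does not define "order match rate" precisely; the version below assumes it is the share of backend orders that also appear in the tracking data. The function names and all sample values are illustrative, not figures from Delivery Hero.

```python
def order_match_rate(tracked_order_ids, backend_order_ids) -> float:
    """Share of backend orders that also appear in tracking data (assumed definition)."""
    backend = set(backend_order_ids)
    if not backend:
        raise ValueError("no backend orders to match against")
    return len(backend & set(tracked_order_ids)) / len(backend)


def cost_per_message(total_cost: float, message_count: int) -> float:
    """Total pipeline cost divided by the number of messages processed."""
    if message_count <= 0:
        raise ValueError("message_count must be positive")
    return total_cost / message_count


def relative_reduction(old: float, new: float) -> float:
    """Fractional reduction of `new` relative to `old` (0.25 means 25% lower)."""
    if old <= 0:
        raise ValueError("old must be positive")
    return (old - new) / old


# Toy data: three of four backend orders were tracked; "x9" is a stray tracked ID.
match_rate = order_match_rate(["o1", "o2", "o3", "x9"], ["o1", "o2", "o3", "o4"])

# Made-up monthly totals for the same message volume on two platforms.
old_cost = cost_per_message(400.0, 1_000_000)
new_cost = cost_per_message(300.0, 1_000_000)
saving = relative_reduction(old_cost, new_cost)

print(f"match rate: {match_rate:.0%}, cost reduction: {saving:.0%}")
```

Normalizing cost by message count keeps the comparison fair when traffic grows between measurement periods.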

Testing and Rollout Strategies

  • Doubled Pipeline Testing: To ensure no data loss during migration, a parallel pipeline was run where both GA and the new internal SDK sent data concurrently. This allowed for direct comparison and validation of data parity, albeit at a higher temporary cost.
  • Load Testing: Crucial for validating system resilience under extreme conditions, preventing revenue loss during high-traffic events.
  • KPI-Driven Development: Defining clear metrics like "order match rate" and "cost per message" enabled precise tracking of improvements and informed development priorities.
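
A parity check for the doubled pipeline could look like the sketch below. It assumes that both pipelines carry a shared event identifier (the hypothetical event_id field), which the article does not confirm; the parity_report function and the sample events are illustrative only.

```python
from collections import Counter


def parity_report(legacy_events, new_events, key="event_id"):
    """Compare two event streams by a shared identifier (assumed to exist)."""
    legacy = Counter(event[key] for event in legacy_events)
    new = Counter(event[key] for event in new_events)
    return {
        "missing_in_new": sorted(set(legacy) - set(new)),
        "missing_in_legacy": sorted(set(new) - set(legacy)),
        "duplicated_in_new": sorted(k for k, n in new.items() if n > 1),
    }


# Toy streams: the new pipeline drops "e2", duplicates "e3", and captures "e4",
# which the legacy pipeline never saw.
legacy_stream = [{"event_id": "e1"}, {"event_id": "e2"}, {"event_id": "e3"}]
new_stream = [
    {"event_id": "e1"},
    {"event_id": "e3"},
    {"event_id": "e3"},
    {"event_id": "e4"},
]

report = parity_report(legacy_stream, new_stream)
```

Reporting both directions matters here: events missing from the new pipeline indicate data loss, while events missing from the legacy one may point to gaps the migration has already closed.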
Tags: user tracking, analytics, real-time data, scalability, cost optimization, GDPR, microservices, event-driven architecture
