Airbnb Engineering·June 9, 2026

Airbnb's Evolving Data Architecture for Multi-Product Growth

This article details how Airbnb evolved its offline data warehouse architecture to support the introduction of new product lines (Experiences, Services) alongside its traditional Homes business. It explores the critical trade-off between separate and monolithic data models, outlining a hybrid framework guided by foundational principles and modeling guidelines to balance consistency with flexibility. The solution addresses challenges in data standardization and managing technical debt within a rapidly expanding data ecosystem.


Challenge: Scaling Data Architecture for Multi-Product Expansion

Airbnb faced a significant challenge in evolving its decade-old offline data warehouse to accommodate new product lines like Experiences and Services, beyond its core Homes offering. The primary concern was how to expand the data architecture without disrupting vital analytics services, since a poorly planned expansion risked data silos, inconsistent analytics, and increased technical debt. This scenario highlights a common problem in rapidly growing organizations: how to maintain data integrity and analytical capability while diversifying product offerings.

Core Dilemma: Separate vs. Monolithic Data Models

The central architectural decision revolved around structuring offline data for a multi-product world. Two extreme approaches were considered:

  • Separate Data Models: Distinct table sets for each product, offering tailored data but risking duplicated logic.
  • Monolithic Model: A unified table set for all products, maximizing code reusability and consistency but potentially becoming unwieldy and less adaptable to unique product attributes.
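The two extremes can be contrasted with a minimal sketch using Python's built-in `sqlite3` module. The table and column names (`homes_listings`, `experiences_listings`, `listings`, `product_type`) are hypothetical and chosen only for illustration; they are not taken from Airbnb's warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Separate layout: one table per product (hypothetical names).
CREATE TABLE homes_listings (home_listing_id INTEGER PRIMARY KEY, bedrooms INTEGER);
CREATE TABLE experiences_listings (experience_listing_id INTEGER PRIMARY KEY, duration_minutes INTEGER);

-- Monolithic layout: one table, with the product recorded in a type column.
CREATE TABLE listings (listing_id INTEGER PRIMARY KEY, product_type TEXT NOT NULL);

INSERT INTO homes_listings VALUES (1, 3), (2, 1);
INSERT INTO experiences_listings VALUES (1, 90);
INSERT INTO listings VALUES (1, 'homes'), (2, 'homes'), (3, 'experiences');
""")

# Separate: a cross-product count repeats the same logic once per table.
separate_counts = dict(conn.execute("""
    SELECT 'homes', COUNT(*) FROM homes_listings
    UNION ALL
    SELECT 'experiences', COUNT(*) FROM experiences_listings
""").fetchall())

# Monolithic: the same question is a single GROUP BY.
monolithic_counts = dict(conn.execute(
    "SELECT product_type, COUNT(*) FROM listings GROUP BY product_type"
).fetchall())
```

The separate layout keeps product-specific columns (`bedrooms`, `duration_minutes`) close to the data they describe, at the cost of one query branch per product. The monolithic layout keeps the query short but has no natural home for those columns.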

Balancing Act

Neither separate nor monolithic models are universally superior. The optimal choice depends heavily on the specific business domain. A balanced approach often involves a framework that combines centralized principles with decentralized modeling guidelines, allowing teams flexibility within a consistent structure.

Airbnb's Hybrid Framework: Principles and Guidelines

Airbnb adopted a hybrid framework by establishing three foundational principles for consistency and scalability, coupled with guidelines to empower teams to make domain-specific modeling decisions:

  1. Principle 1: No Hybrid Data Models within a domain. A domain's model must be either completely separate by product type or completely monolithic, which avoids confusion and keeps the model scalable. The framework is hybrid across domains, never inside one.
  2. Principle 2: Consistent Identifier Naming. Primary IDs follow strict conventions (product-specific IDs for separate models, generic IDs with a type column for monolithic models) so that joins stay reliable.
  3. Principle 3: Clear Namespace Organization. Core, product-specific tables live in dedicated product namespaces, and monolithic, cross-cutting tables live in a global namespace. Team-specific namespaces supplement both for flexibility.
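Principles 2 and 3 lend themselves to automated checks. The following is a minimal sketch of such a check, not Airbnb's tooling: the namespace names, the ID prefixes, the `product_type` column name, and the decision to leave team-specific namespaces unchecked are all assumptions made for illustration.

```python
from dataclasses import dataclass

# All names below are illustrative assumptions, not Airbnb's actual conventions.
PRODUCT_ID_PREFIX = {"homes": "home_", "experiences": "experience_", "services": "service_"}
GLOBAL_NAMESPACE = "global"
TYPE_COLUMN = "product_type"


@dataclass(frozen=True)
class TableSpec:
    namespace: str
    name: str
    id_column: str
    columns: tuple


def check_conventions(table: TableSpec) -> list:
    """Return a list of convention violations (empty if the table passes)."""
    problems = []
    if table.namespace in PRODUCT_ID_PREFIX:
        # Product namespace: separate model, so the ID must be product-specific.
        prefix = PRODUCT_ID_PREFIX[table.namespace]
        if not table.id_column.startswith(prefix):
            problems.append(f"{table.name}: ID should start with '{prefix}'")
    elif table.namespace == GLOBAL_NAMESPACE:
        # Global namespace: monolithic model, so the ID is generic and typed.
        if any(table.id_column.startswith(p) for p in PRODUCT_ID_PREFIX.values()):
            problems.append(f"{table.name}: ID should be generic, not product-specific")
        if TYPE_COLUMN not in table.columns:
            problems.append(f"{table.name}: missing '{TYPE_COLUMN}' column")
    # Any other namespace is treated as team-specific and left unchecked.
    return problems
```

For example, `check_conventions(TableSpec("global", "messages", "message_id", ("message_id", "thread_id")))` reports the missing type column, while a `services` table keyed by `service_offering_id` passes.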

Modeling guidelines helped teams choose between separate and monolithic approaches based on factors like shared vs. unique product attributes, future evolution, upstream alignment, downstream consumers, code maintainability, data volume & performance, and compatibility.

Application: When to Go Separate vs. Monolithic

The most decisive factor was whether product lines shared mostly common data attributes or had significant unique attributes.
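One rough way to make "mostly common" versus "significantly unique" concrete is to measure attribute overlap. The sketch below is an illustration only: the 0.5 threshold and the example attribute names are arbitrary assumptions, and it captures just this one factor, not the other guidelines (future evolution, downstream consumers, data volume, and so on).

```python
def attribute_overlap(*attribute_sets):
    """Share of attributes common to every product, out of all attributes seen."""
    shared = set.intersection(*attribute_sets)
    combined = set.union(*attribute_sets)
    return len(shared) / len(combined) if combined else 0.0


def suggest_model(*attribute_sets, threshold=0.5):
    """Suggest a starting point; the threshold is an arbitrary assumption."""
    overlap = attribute_overlap(*attribute_sets)
    return "monolithic" if overlap >= threshold else "separate"


# Hypothetical attribute sets, for illustration only.
homes_listing = {"listing_id", "host_id", "price", "calendar_dates"}
services_listing = {"listing_id", "host_id", "price", "business_hours",
                    "service_radius", "parent_service_id"}
homes_payment = {"payment_id", "amount", "currency", "payout_id"}
services_payment = {"payment_id", "amount", "currency", "payout_id"}

listing_choice = suggest_model(homes_listing, services_listing)
payment_choice = suggest_model(homes_payment, services_payment)
```

Here the listing attributes share 3 of 7 fields, so the sketch leans toward separate models, while the payment attributes are identical and lean monolithic. A score like this is at most a conversation starter for the modeling decision, not a substitute for it.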

Separate Data Models

Used for product-specific logic where attributes are highly distinct. Examples from Airbnb:

  • Listings: New "Service offerings" with many-to-one relationships to a parent service, unlike Homes/Experiences.
  • Availability: Flexible "business hours" for Services, distinct from calendar dates (Homes) or specific start times (Experiences).
  • Location: Radius-based "Service areas" for hosts who travel, different from fixed locations.
  • Guests: Distinct user journeys and high-volume interaction data for each product require separate models for performance and accuracy.
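The availability case shows why a single shape fits poorly. The sketch below models the three availability concepts as separate Python dataclasses. The class and field names are hypothetical, and representing business hours as one weekday with an opening and closing time is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import date, datetime, time


@dataclass(frozen=True)
class HomeAvailability:
    home_listing_id: int
    night: date            # Homes: availability is tracked per calendar date


@dataclass(frozen=True)
class ExperienceAvailability:
    experience_listing_id: int
    starts_at: datetime    # Experiences: a specific start time


@dataclass(frozen=True)
class ServiceAvailability:
    service_offering_id: int
    weekday: int           # 0 = Monday, matching datetime.weekday()
    opens: time
    closes: time

    def is_open(self, moment: datetime) -> bool:
        """True if the moment falls inside this weekday's business hours."""
        return (moment.weekday() == self.weekday
                and self.opens <= moment.time() < self.closes)
```

Forcing all three into one table would leave most columns empty for any given row, which is the kind of unwieldiness the separate approach avoids.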

Monolithic Data Models

Used for cross-cutting concepts that span multiple products. Examples from Airbnb:

  • Messaging: A single message thread can involve multiple product types, requiring a unified view.
  • Payments: Transactional data is largely product-agnostic, handling refunds and payouts uniformly.
  • Customer Support: Support requests often involve multiple products, necessitating a holistic view of user history.
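The messaging case can be sketched with a single table that carries a generic ID plus a type column. The schema and sample rows below are hypothetical and use Python's built-in `sqlite3` module purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
    message_id   INTEGER PRIMARY KEY,  -- generic ID, not product-specific
    thread_id    INTEGER NOT NULL,
    product_type TEXT NOT NULL,        -- e.g. 'homes', 'services'
    body         TEXT NOT NULL
);
INSERT INTO messages VALUES
    (1, 10, 'homes',       'Is early check-in possible?'),
    (2, 10, 'services',    'Can the chef arrive at 6pm?'),
    (3, 11, 'experiences', 'Where do we meet?');
""")

# Threads that touch more than one product are visible in a single query.
multi_product_threads = conn.execute("""
    SELECT thread_id, COUNT(DISTINCT product_type) AS products
    FROM messages
    GROUP BY thread_id
    HAVING COUNT(DISTINCT product_type) > 1
""").fetchall()
```

With per-product message tables, thread 10 would be split across two tables, and reconstructing it would require stitching them back together in every downstream query.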

