InfoQ Architecture · July 3, 2026

Cloudflare's Town Lake: A Unified Lakehouse Platform for Data Integration and AI-Powered Analytics

Cloudflare built Town Lake, a unified data platform, to consolidate operational, billing, security, and business data that was previously scattered across systems such as PostgreSQL, ClickHouse, and Kafka. The platform uses a lakehouse architecture built on Trino and Apache Iceberg, providing a single SQL interface for queries that span multiple data stores. An AI-powered agent, Skipper, sits on top of Town Lake and gives internal teams natural-language access to the data.


The Challenge of Fragmented Data at Scale

Cloudflare processes over a billion events per second globally, and its data was spread across many separate systems, including PostgreSQL, ClickHouse, Kafka, BigQuery, and object storage. This fragmentation complicated data discovery, analysis, and governance for engineers and analysts. A unified way to access and manage the data became critical for internal workloads such as billing, security, and business intelligence.

Town Lake: A Unified Lakehouse Architecture

To address this fragmentation, Cloudflare built Town Lake, a unified data platform based on a lakehouse architecture. Its key components are:

  • Trino: Provides a unified SQL interface, so a single query can join data across different underlying systems (PostgreSQL, ClickHouse, Iceberg) without first moving the data.
  • Apache Iceberg: Serves as the data lake table format, offering capabilities such as schema evolution, hidden partitioning, and time travel.
  • Cloudflare R2: Object storage for cost-effective and scalable data retention.
  • DataHub: Manages metadata, facilitating data discovery and lineage tracking.
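
As an illustration of what "joining across systems without data movement" can look like in practice, here is a minimal Python sketch that builds one federated Trino query. It assumes three Trino catalogs named `postgres`, `clickhouse`, and `iceberg`; every catalog, schema, table, and column name is hypothetical and does not come from Cloudflare's deployment. The `FOR TIMESTAMP AS OF` clause is Trino's time-travel syntax for Iceberg tables.

```python
from datetime import datetime, timezone


def federated_billing_query(as_of: datetime) -> str:
    """Build a Trino query that joins three catalogs in one statement.

    All catalog, schema, table, and column names below are hypothetical
    placeholders chosen for illustration only.
    """
    # Render the snapshot time as a UTC timestamp literal for Iceberg time travel.
    ts = as_of.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S") + " UTC"
    return f"""
SELECT a.account_id,
       a.plan,
       sum(u.requests)   AS requests,
       max(i.amount_usd) AS invoiced_usd
FROM postgres.public.accounts AS a
JOIN clickhouse.analytics.usage_daily AS u
  ON u.account_id = a.account_id
JOIN iceberg.billing.invoices FOR TIMESTAMP AS OF TIMESTAMP '{ts}' AS i
  ON i.account_id = a.account_id
GROUP BY a.account_id, a.plan
""".strip()


if __name__ == "__main__":
    print(federated_billing_query(datetime(2026, 7, 1, tzinfo=timezone.utc)))
```

The resulting string could be submitted through any Trino client. The point of the sketch is that the catalog prefix on each table name is the only place where the underlying storage system is visible to the person writing the query.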

Architectural Insight: Lakehouse Pattern

The lakehouse architecture combines the flexibility and cost-effectiveness of data lakes with the data management features (schema enforcement, ACID transactions) of data warehouses. This pattern is ideal for organizations dealing with diverse data types and needing both raw data storage and structured query capabilities, often reducing the need for separate data warehousing solutions for specific workloads.

Data Governance and AI-Powered Analytics with Skipper

A central feature of Town Lake is its default-closed governance model. Newly ingested datasets stay inaccessible until two checks are complete: automated scanning by an internal service called Skimmer, which combines automated classification with AI-based analysis, and human review for sensitive data (PII). This layered approach helps keep sensitive information under control.

On top of Town Lake, Cloudflare built Skipper, an AI-powered analytics agent that translates natural-language requests into validated SQL queries. Skipper draws on metadata, schema definitions, transformation lineage, and runtime inspection to improve query accuracy, and it provides an auditable trail for the insights it produces. Billing workloads currently account for just over half (53%) of the queries on Town Lake.
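
The default-closed rule described in this section can be sketched as a small access gate. This is a simplified model under stated assumptions, not Cloudflare's implementation: the `Dataset` and `Review` types and their fields are hypothetical, and the real Skimmer service and review workflow are not described in enough detail in the article to reproduce.

```python
from dataclasses import dataclass
from enum import Enum


class Review(Enum):
    """Outcome of the human sensitive-data review (hypothetical states)."""
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Dataset:
    """Hypothetical record tracking governance status for one dataset."""
    name: str
    scan_complete: bool = False             # automated scan has finished
    human_review: Review = Review.PENDING   # reviewer decision on sensitive data

    def is_queryable(self) -> bool:
        # Default closed: a dataset is readable only after BOTH checks pass.
        return self.scan_complete and self.human_review is Review.APPROVED


if __name__ == "__main__":
    ds = Dataset("billing.invoices")
    print(ds.is_queryable())   # newly ingested, so access is denied
    ds.scan_complete = True
    ds.human_review = Review.APPROVED
    print(ds.is_queryable())   # both checks passed, so access is allowed
```

The design choice worth noting is that the defaults encode the closed state, so a dataset that skips either step can never become readable by accident.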
