The New Stack·June 16, 2026

Databricks LTAP: Merging Transactional and Analytical Databases for AI Agents

This article introduces Databricks' Lake Transactional/Analytical Processing (LTAP) architecture, which aims to unify operational and analytical workloads in a single data layer. LTAP is designed to simplify data infrastructure for AI agents by eliminating ETL pipelines and data duplication, leveraging open formats and separate compute engines on a lakehouse foundation. It represents a significant architectural shift towards a unified data platform.

The Challenge: Dual Database Systems

Traditionally, enterprises maintain two distinct database systems: Online Transaction Processing (OLTP) for live business operations (e.g., orders, payments) and Online Analytical Processing (OLAP) for reporting and analysis. OLTP systems are optimized for fast writes and typically use row-based storage, while OLAP systems are tuned for large scans and typically use columnar storage. Bridging the two usually requires complex ETL (Extract, Transform, Load) pipelines and data replication, which lead to stale data, higher operational overhead, and governance challenges.
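To make the row-versus-column distinction concrete, here is a minimal Python sketch. It is a toy in-memory model, not how any real OLTP or OLAP engine stores data, and the helper names (to_columns, insert_row_store, insert_column_store) are made up for illustration.

```python
# Toy model only: real engines use pages, files, and indexes, not Python lists.

def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns

def insert_row_store(rows, record):
    """Row layout: a new record is one append."""
    rows.append(record)

def insert_column_store(columns, record):
    """Column layout: a new record touches every column."""
    for key, value in record.items():
        columns[key].append(value)

rows = [
    {"order_id": 1, "customer": "ada", "amount": 30.0},
    {"order_id": 2, "customer": "bob", "amount": 12.5},
    {"order_id": 3, "customer": "ada", "amount": 7.5},
]
columns = to_columns(rows)

new_order = {"order_id": 4, "customer": "cy", "amount": 50.0}
insert_row_store(rows, new_order)
insert_column_store(columns, new_order)

# Analytical scan: the column layout reads only the "amount" values,
# while the row layout has to visit every whole record.
total_from_columns = sum(columns["amount"])
total_from_rows = sum(row["amount"] for row in rows)
```

Both totals agree, but the work differs: the row layout favors writing or fetching one complete record, and the column layout favors aggregating one field across many records.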

Introducing Lake Transactional/Analytical Processing (LTAP)

Databricks' LTAP architecture proposes a unified approach to transactional and analytical data by consolidating them into a single storage layer. This approach is particularly targeted at the evolving needs of AI agents, which require the ability to reason over live transactional data and historical context simultaneously. Key principles of LTAP include:

  • Single Storage Layer: Data is stored once on cloud object storage in open formats (e.g., Delta Lake, Iceberg).
  • Separate Compute Engines: Although storage is unified, transactional (Lakebase) and analytical (Lakehouse//RT) workloads run on distinct compute engines, each optimized for its own access patterns.
  • Elimination of ETL: By removing the need to move and duplicate data, LTAP aims to reduce complexity and keep data fresh.
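The three principles above can be sketched as a toy Python model: one shared store, two engines with different access patterns, and no copy step between them. This is an illustration under stated assumptions, not Databricks' implementation; SharedStore, TransactionalEngine, and AnalyticalEngine are hypothetical names, and an in-memory dict stands in for open-format tables on object storage.

```python
class SharedStore:
    """Hypothetical stand-in for the single storage layer."""
    def __init__(self):
        self.orders = {}  # order_id -> record

class TransactionalEngine:
    """Point reads and writes, one record at a time."""
    def __init__(self, store):
        self.store = store

    def put_order(self, order_id, amount):
        self.store.orders[order_id] = {"order_id": order_id, "amount": amount}

    def get_order(self, order_id):
        return self.store.orders[order_id]

class AnalyticalEngine:
    """Full scans and aggregates over the same data."""
    def __init__(self, store):
        self.store = store

    def total_revenue(self):
        return sum(rec["amount"] for rec in self.store.orders.values())

store = SharedStore()
oltp = TransactionalEngine(store)
olap = AnalyticalEngine(store)

oltp.put_order(1, 20.0)
revenue_after_first = olap.total_revenue()   # sees the write with no ETL step

oltp.put_order(2, 15.0)
revenue_after_second = olap.total_revenue()
```

Because both engines hold a reference to the same store, a write through one is immediately visible to the other. The sketch ignores concurrency, isolation, and storage layout, which are the hard parts in a real system.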

Key Components of LTAP

  • Lakebase: A Postgres-based operational database that separates compute from storage and places data directly in the lake in open formats. It is extended with native vector and full-text search and with real-time event ingestion (Zerobus).
  • Lakehouse//RT: A real-time analytics engine powered by a vectorized engine (Reyden) that runs directly on Delta and Iceberg tables. It aims to deliver millisecond-level latency on lakehouse data without needing a separate serving layer.
  • Mooncake & Neon Integration: Mooncake mirrors Postgres changes to the lakehouse in real time for analytical queries, while Neon provides Git-style branching for databases, allowing AI agents to quickly fork, experiment with, and discard database copies backed by object storage.
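The Git-style branching idea can be illustrated with a small copy-on-write sketch in Python. This is a toy model of the general technique, not Neon's implementation or API; the Branch class and its methods are hypothetical.

```python
class Branch:
    """Toy copy-on-write branch: writes stay local, reads fall back to the parent."""
    def __init__(self, parent=None):
        self.parent = parent
        self.local = {}  # only the keys written on this branch

    def get(self, key):
        if key in self.local:
            return self.local[key]
        if self.parent is not None:
            return self.parent.get(key)
        raise KeyError(key)

    def put(self, key, value):
        self.local[key] = value

    def fork(self):
        # Forking copies no data; the child just points at its parent.
        return Branch(parent=self)

main = Branch()
main.put("price", 10)
main.put("stock", 5)

experiment = main.fork()
experiment.put("price", 12)  # visible only on the experiment branch

price_on_experiment = experiment.get("price")
price_on_main = main.get("price")
stock_on_experiment = experiment.get("stock")  # falls through to main

experiment = None  # discarding the branch drops only its local writes
```

Forking is cheap because nothing is copied up front, which is what makes a fork, experiment, and discard loop practical. Unlike this sketch, a real system would pin the child to a snapshot so that later writes on the parent do not leak into the branch.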

System Design Implications

The LTAP architecture targets a common problem in data-intensive application design: getting real-time insights from transactional data. By consolidating the data plane, it offers a blueprint for reducing data silos, simplifying data governance, and improving data freshness. Architects should weigh the trade-off: simpler pipelines and more consistent data on one side, and the potential complexity of operating a single, tightly integrated data platform on the other.

Tags: data lakehouse, HTAP, OLTP, OLAP, real-time analytics, data unification, AI agents, Databricks
