InfoQ Cloud·June 15, 2026

Decentralized Data Systems: Cloud-Native, Local-First, and the AT Protocol

This article, featuring Martin Kleppmann, author of 'Designing Data-Intensive Applications', explores the evolution of data systems, emphasizing the shift from monolithic to modular, cloud-native architectures. It examines the trade-offs and design considerations for decentralized data storage, using Bluesky's AT Protocol as a case study, and highlights the 'local-first' software movement, which aims to give users more agency over their data and better offline capabilities.

The Evolution of Data Systems: From Monoliths to Modular Cloud-Native Architectures

The past decade has seen a significant shift in data system design, away from large, monolithic databases and towards modular, cloud-native architectures. Historically, distributed databases stored data on local disks and handled replication themselves, in the database software. Modern systems increasingly use cloud services such as object stores (e.g., S3) as the basic storage abstraction. Replication then happens at the object store level, which changes how databases and other data-intensive applications are built on top. Decoupling storage from compute in this way improves maintainability, performance, and flexibility, because systems can be composed from separate building blocks such as columnar file formats (e.g., Apache Parquet) and a choice of query engines.
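To make the storage/compute split concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `InMemoryObjectStore` is a hypothetical stand-in for a service like S3, the `table/...` key layout and JSON segments are invented (a real stack would more likely use a columnar format such as Parquet), and `scan` plays the role of a stateless query engine.

```python
import json
from typing import Dict, List, Optional


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store: put/get whole objects by key."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self._objects.get(key)


MANIFEST_KEY = "table/manifest.json"  # assumed layout, not a real table format


def append_rows(store: InMemoryObjectStore, rows: List[dict]) -> str:
    """Write rows as a new immutable segment, then list it in the manifest."""
    raw = store.get(MANIFEST_KEY)
    segments: List[str] = json.loads(raw) if raw else []
    key = f"table/segment-{len(segments):05d}.json"
    store.put(key, json.dumps(rows).encode())
    store.put(MANIFEST_KEY, json.dumps(segments + [key]).encode())
    return key


def scan(store: InMemoryObjectStore) -> List[dict]:
    """A stateless 'query engine': all the state it reads lives in the store."""
    raw = store.get(MANIFEST_KEY)
    rows: List[dict] = []
    for key in (json.loads(raw) if raw else []):
        rows.extend(json.loads(store.get(key)))
    return rows


store = InMemoryObjectStore()
append_rows(store, [{"user": "alice", "likes": 3}])
append_rows(store, [{"user": "bob", "likes": 5}])
total_likes = sum(row["likes"] for row in scan(store))
```

Because `scan` keeps no local state, any number of compute nodes could run it against the same store. The sketch ignores concurrent writers: a real system would need some atomic way to update the manifest.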

Decentralization and User Data Agency

A key theme is the move from cloud-centric to decentralized data storage, which increases user agency and mitigates vendor lock-in. The article discusses two major approaches: purely federated models (like ActivityPub, used by Mastodon) and designs that balance federation with global indexing and consistency (like Bluesky's AT Protocol). The AT Protocol aims to provide a decentralized social network experience that is indistinguishable from a centralized one, giving users consistent views of data (e.g., reply threads, likes) across different servers. This is a deliberate trade-off against ActivityPub's emphasis on maximum decentralization.

Designing for Decentralization

When designing a decentralized system, consider the balance between strict federation and the need for global consistency and a seamless user experience. The AT Protocol prioritizes a consistent user view, even in a distributed environment, which raises different challenges from those faced by systems optimized for pure federation.
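The difference between the two approaches can be illustrated with a toy model. The sketch below is a deliberate simplification and does not implement either ActivityPub or the AT Protocol: `Server`, `Post`, `publish`, and `global_thread` are hypothetical names, the "federated" side assumes posts are pushed only to selected servers, and the "indexed" side assumes some service can crawl every server.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Post:
    post_id: str
    author: str
    text: str
    reply_to: Optional[str] = None


@dataclass
class Server:
    """Hypothetical home server: its own users' posts plus delivered copies."""
    name: str
    own: Dict[str, Post] = field(default_factory=dict)
    received: Dict[str, Post] = field(default_factory=dict)

    def publish(self, post: Post, deliver_to: List["Server"]) -> None:
        self.own[post.post_id] = post
        for server in deliver_to:  # push model: only chosen servers get a copy
            server.received[post.post_id] = post

    def local_thread(self, root_id: str) -> List[str]:
        """The thread as seen using only what this server happens to know."""
        known = {**self.own, **self.received}
        return sorted(p.post_id for p in known.values()
                      if p.post_id == root_id or p.reply_to == root_id)


def global_thread(servers: List[Server], root_id: str) -> List[str]:
    """Indexer model: crawl every server's own posts and build one shared view."""
    index = {pid: p for s in servers for pid, p in s.own.items()}
    return sorted(p.post_id for p in index.values()
                  if p.post_id == root_id or p.reply_to == root_id)


a, b, c = Server("a.example"), Server("b.example"), Server("c.example")
a.publish(Post("p1", "alice", "hello"), deliver_to=[b, c])
b.publish(Post("r1", "bob", "hi alice", reply_to="p1"), deliver_to=[a])
c.publish(Post("r2", "carol", "welcome", reply_to="p1"), deliver_to=[a])

view_from_b = b.local_thread("p1")                # ['p1', 'r1']
view_from_index = global_thread([a, b, c], "p1")  # ['p1', 'r1', 'r2']
```

In this model a reader on `b.example` never sees Carol's reply, while every client of the index sees the same complete thread. The cost is that something has to crawl and index all servers, which is the kind of trade-off the section describes.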

Local-First Software Movement

The local-first software movement champions the idea that the primary copy of user data should reside on the client device. This principle enables robust offline access, strengthens data ownership, and reduces dependency on central services, which mitigates the risks of vendor lock-in and service shutdowns. Libraries such as Automerge are important building blocks for such applications, providing Git-like version control and real-time collaboration, even for complex, non-textual data formats.
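As a rough illustration of how two devices can edit offline and still converge, here is a minimal last-writer-wins map, one of the simplest CRDTs. It is a sketch under stated assumptions and is not Automerge's API or algorithm: the `Replica` class and its methods are hypothetical, and unlike richer CRDT libraries it simply discards the losing side of a concurrent edit to the same key.

```python
from typing import Dict, Tuple

# key -> (logical_counter, replica_id, value)
Entry = Tuple[int, str, str]


class Replica:
    """A last-writer-wins map: a small state-based CRDT (hypothetical sketch)."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.clock = 0
        self.entries: Dict[str, Entry] = {}

    def set(self, key: str, value: str) -> None:
        self.clock += 1
        self.entries[key] = (self.clock, self.replica_id, value)

    def merge(self, other: "Replica") -> None:
        for key, entry in other.entries.items():
            mine = self.entries.get(key)
            # Compare (counter, replica_id) so every replica picks the same winner.
            if mine is None or entry[:2] > mine[:2]:
                self.entries[key] = entry
        self.clock = max(self.clock, other.clock)

    def value(self) -> Dict[str, str]:
        return {key: entry[2] for key, entry in self.entries.items()}


laptop, phone = Replica("laptop"), Replica("phone")
laptop.set("title", "Trip plan")
phone.merge(laptop)  # initial sync

# Both devices edit while offline.
laptop.set("title", "Trip plan (draft)")
phone.set("notes", "book train")
phone.set("title", "Lisbon trip")

# Sync in both directions; the order of the two merges does not matter.
laptop.merge(phone)
phone.merge(laptop)
```

After the two merges both replicas hold the same state without any server deciding the outcome: the phone's title wins because its write carries the higher counter, and the notes field written on only one device is kept.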

Key Takeaways for System Designers

  • Cloud-Native Architectures: Embrace object stores as the underlying replicated storage for databases and applications.
  • Modular Data Stacks: Break monolithic data systems into composable building blocks for increased flexibility and experimentation.
  • Decentralization Trade-offs: Carefully evaluate the balance between federated models and the need for global indexing and consistency (e.g., AT Protocol vs. ActivityPub).
  • User Agency & Local-First: Prioritize user data ownership by designing around local-first principles, enabling offline access and reducing vendor dependency. Use tools like Automerge for CRDT-based collaboration.
Tags: decentralization, cloud-native, object storage, data sovereignty, local-first, AT Protocol, ActivityPub, CRDTs
