Hacker News·June 17, 2026

Lore: A Scalable, Content-Addressed Version Control System

Lore is an open-source, centralized version control system designed by Epic Games for extreme scalability, particularly for projects combining code with large binary assets. Its architecture leverages content-addressed storage using Merkle trees and an immutable revision chain for data integrity and efficient deduplication. Key design choices focus on optimizing for large teams and high-throughput scenarios, enabling on-demand data hydration and lightweight branching.

Lore, developed and maintained by Epic Games, is a version control system built for large projects that combine code with substantial binary assets. Traditional version control systems often slow down as repositories and development teams grow very large, especially in domains such as game development and entertainment.

Core Architectural Principles

Lore's design is rooted in a centralized, content-addressed storage model. This means that repository data is identified and retrieved based on its cryptographic hash, rather than its location or name. This fundamental choice underpins several key scalability and integrity features:

  • Merkle Trees: Repository states are represented as Merkle trees, which enable rapid verification of data integrity and efficient detection of changes. Each node's hash depends on the hashes of its children, ensuring tamper-evidence.
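  • Immutable Revision Chain: Revisions form an immutable chain where each revision's hash is derived from its state, parent revision hashes, and data hashes. Because every hash covers its parents, altering any past revision would change the hash of every revision after it, which gives the history cryptographic integrity and a verifiable source of truth.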
  • Chunked Storage: Large files are broken down into smaller, reusable chunks. This significantly reduces data duplication across different versions and branches, making updates and transfers more efficient, particularly for binary assets.
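
The Merkle tree and revision chain ideas above can be sketched in a few lines. This is an illustration only: the source does not say which hash function or object encoding Lore uses, so SHA-256, the `blob`/`tree`/`rev` prefixes, and the function names `hash_blob`, `hash_tree`, and `hash_revision` are all assumptions made for the example.

```python
import hashlib


def hash_blob(data: bytes) -> str:
    """Content address of a file's bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(b"blob\0" + data).hexdigest()


def hash_tree(entries: dict[str, str]) -> str:
    """Hash a directory node from its children's names and hashes.

    Entries are sorted so the same directory always yields the same hash.
    """
    h = hashlib.sha256(b"tree\0")
    for name in sorted(entries):
        h.update(name.encode() + b"\0" + entries[name].encode() + b"\n")
    return h.hexdigest()


def hash_revision(root_hash: str, parents: list[str], message: str) -> str:
    """Hash a revision from its root tree hash and its parent revision hashes."""
    h = hashlib.sha256(b"rev\0")
    h.update(root_hash.encode() + b"\n")
    for parent in parents:
        h.update(parent.encode() + b"\n")
    h.update(message.encode())
    return h.hexdigest()


# Two repository states that differ in a single texture file.
readme = hash_blob(b"hello")
assets_v1 = hash_tree({"rock.tex": hash_blob(b"texture-v1")})
assets_v2 = hash_tree({"rock.tex": hash_blob(b"texture-v2")})
root_v1 = hash_tree({"README": readme, "assets": assets_v1})
root_v2 = hash_tree({"README": readme, "assets": assets_v2})

# Each revision commits to its tree and to its parent revision.
rev1 = hash_revision(root_v1, [], "initial")
rev2 = hash_revision(root_v2, [rev1], "update texture")
```

Changing one file changes the hash of its directory and of the root, while the untouched `README` keeps its hash, so comparing two states only requires descending into subtrees whose hashes differ.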

Scalability and Efficiency Mechanisms

Design Insight: Deduplication for Binaries

Content addressing combined with chunking is what reduces storage and network cost: a chunk that already exists under its hash does not need to be stored or transferred again. This matters most for the large, frequently changing binary files common in game development.
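
A minimal sketch of chunk-level deduplication follows. It assumes fixed-size chunks and SHA-256 addresses, neither of which is confirmed by the source; `ChunkStore`, `put_file`, `get_file`, and the tiny `CHUNK_SIZE` are hypothetical names and values chosen to keep the example readable.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, so the example data stays readable


class ChunkStore:
    """Toy content-addressed store: each chunk is kept once under its hash."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}

    def put_file(self, data: bytes) -> list[str]:
        """Split data into chunks, store new ones, return the ordered hash list."""
        manifest = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            manifest.append(digest)
        return manifest

    def get_file(self, manifest: list[str]) -> bytes:
        """Reassemble a file from its ordered chunk hashes."""
        return b"".join(self.chunks[digest] for digest in manifest)


store = ChunkStore()
v1 = store.put_file(b"AAAABBBBCCCCDDDD")
v2 = store.put_file(b"AAAABBBBXXXXDDDD")  # only the third chunk differs

# Two 4-chunk versions are held as 5 unique chunks instead of 8.
unique_chunks = len(store.chunks)
```

Fixed-size chunking stops matching once bytes are inserted or removed in the middle of a file, which is why some systems use content-defined chunk boundaries instead; the source does not state which strategy Lore uses.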

  • On-demand Hydration and Sparse Workspaces: Developers only download the file data they need, keeping local workspaces lightweight and speeding up initial setup and branch switching. This is a significant advantage over systems that require a full clone.
  • Centralized Service with Caching: The architecture employs a service-backed model with caching layers in front of durable storage. This design allows the system to scale throughput for concurrent operations from large teams and manage vast repositories effectively.
  • Lightweight Branches: Branches are implemented as mutable references, not full data copies. This ensures that creating and switching between branches is a low-overhead operation, promoting rapid experimentation and iteration without incurring significant storage or performance penalties.
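
On-demand hydration and reference-based branches can be sketched together. Everything here is hypothetical: `Server`, `Workspace`, their methods, and the plain-string revision id `"r1"` are stand-ins invented for illustration, not Lore's API, and the in-memory dictionaries stand in for the central service and its storage.

```python
import hashlib


class Server:
    """Stand-in for the central service (all names are hypothetical)."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}              # content hash -> data
        self.manifests: dict[str, dict[str, str]] = {}   # revision id -> {path: hash}
        self.branches: dict[str, str] = {}               # branch name -> revision id

    def commit(self, branch: str, rev_id: str, files: dict[str, bytes]) -> None:
        """Store file contents by hash and point the branch at the new revision."""
        manifest = {}
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            self.objects.setdefault(digest, data)
            manifest[path] = digest
        self.manifests[rev_id] = manifest
        self.branches[branch] = rev_id

    def create_branch(self, new: str, source: str) -> None:
        """Creating a branch copies one reference, not any file data."""
        self.branches[new] = self.branches[source]


class Workspace:
    """Sparse workspace: file data is fetched on first read, not on switch."""

    def __init__(self, server: Server) -> None:
        self.server = server
        self.manifest: dict[str, str] = {}
        self.local: dict[str, bytes] = {}
        self.fetch_count = 0

    def switch(self, branch: str) -> None:
        """Switching only swaps the path-to-hash manifest."""
        self.manifest = self.server.manifests[self.server.branches[branch]]

    def read(self, path: str) -> bytes:
        """Hydrate a file on demand, reusing data already held locally."""
        digest = self.manifest[path]
        if digest not in self.local:
            self.local[digest] = self.server.objects[digest]
            self.fetch_count += 1
        return self.local[digest]


server = Server()
server.commit("main", "r1", {
    "src/main.cpp": b"int main() {}",
    "assets/level.bin": b"\x00" * 1024,
})
server.create_branch("feature", "main")

ws = Workspace(server)
ws.switch("feature")
source = ws.read("src/main.cpp")  # fetches one object
source = ws.read("src/main.cpp")  # served locally, no second fetch
```

In this sketch the large `assets/level.bin` is never transferred because nothing reads it, and the new branch adds a single dictionary entry on the server.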

Tags: version control, VCS, scalability, content-addressed storage, Merkle trees, binary assets, deduplication, distributed storage
