Dropbox Tech·April 2, 2026

Optimizing Storage Efficiency in Immutable Blob Stores: Dropbox's Magic Pocket Compaction Strategies

This article details Dropbox's approach to improving storage efficiency in Magic Pocket, its exabyte-scale immutable blob store. It highlights the challenges of fragmentation in immutable systems and introduces a multi-strategy compaction system (L1, L2, L3) designed to efficiently reclaim fragmented space and reduce storage overhead, especially in scenarios with severely under-filled volumes.


The Challenge of Immutability and Fragmentation

Immutable blob stores, like Dropbox's Magic Pocket, offer significant benefits in data durability and consistency. However, they introduce a unique challenge: when files are updated or deleted, the old data is not immediately overwritten but remains on disk until explicitly reclaimed. This leads to data fragmentation, where live data is sparsely distributed across many volumes, increasing storage overhead and costs. At exabyte scale, even minor increases in overhead translate to substantial infrastructure expenses, necessitating robust space reclamation strategies.
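The cost of this fragmentation is easy to quantify: the ratio of physical capacity provisioned to live bytes actually stored. A minimal sketch, using hypothetical volume sizes and fill levels rather than Dropbox's actual figures:

```python
# Illustrative arithmetic only: how sparsely filled volumes inflate the
# physical footprint. Volume size and fill levels are hypothetical,
# not Dropbox's actual numbers.

VOLUME_SIZE_GB = 1_000  # hypothetical fixed volume size

def physical_overhead(live_gb_per_volume):
    """Ratio of physical capacity provisioned to live data actually stored."""
    provisioned = len(live_gb_per_volume) * VOLUME_SIZE_GB
    live = sum(live_gb_per_volume)
    return provisioned / live

# Ten volumes each only 20% full: 5x physical capacity per live byte.
print(physical_overhead([200] * 10))      # → 5.0

# The same live data compacted into two full volumes: overhead drops to 1x.
print(physical_overhead([1_000, 1_000]))  # → 1.0
```

At exabyte scale, shaving even a fraction off this ratio frees large amounts of hardware, which is why compaction is treated as a first-class system.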

ℹ️

Erasure Coding vs. Replication

Dropbox uses erasure coding for most data, which splits data into fragments and adds parity fragments for fault tolerance. This provides similar durability to replication (storing multiple full copies) but with significantly less storage overhead. However, efficient redundancy alone isn't enough; continuous compaction is still critical to manage fragmentation.
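The overhead difference is simple to compute. In the sketch below, the (k, m) erasure-coding parameters are illustrative; the article does not state Magic Pocket's actual scheme:

```python
# Back-of-the-envelope storage overhead: replication vs erasure coding.
# The (k, m) parameters are illustrative assumptions, not Dropbox's scheme.

def replication_overhead(copies: int) -> float:
    """Full copies stored per byte of data."""
    return float(copies)

def erasure_overhead(k: int, m: int) -> float:
    """(k + m) fragments stored per k fragments of original data."""
    return (k + m) / k

print(replication_overhead(3))  # → 3.0: 3x storage for 3-way replication
print(erasure_overhead(12, 4))  # → ~1.33x, while surviving m lost fragments
```

Erasure coding keeps redundancy cheap, but it does nothing about dead bytes left behind by deletes; that is compaction's job.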

Magic Pocket's Multi-Strategy Compaction System

Dropbox developed a multi-strategy compaction system to address varying levels of fragmentation. The core process involves garbage collection (identifying unreferenced blobs) and compaction (physically moving live blobs from fragmented volumes to new, denser volumes, then retiring the old ones). The incident with the Live Coder service, which caused widespread under-filled volumes, exposed limitations in their initial steady-state compaction strategy (L1) and prompted the development of more aggressive strategies.

  • L1 Compaction (Steady State): Designed for normal operations where most volumes are highly filled and fragmentation accumulates gradually. It acts as a packing problem, moving live data from partially filled *donor* volumes into a highly filled *host* volume to create a single new densely packed volume. It's simple and fast but inefficient for widely sparse volumes.
  • L2 Compaction (Aggressive Consolidation): Introduced to tackle a long tail of moderately under-filled volumes (e.g., less than half full). L2 uses a bounded dynamic programming approach to group multiple sparse volumes whose combined live bytes can nearly fill a new destination volume. This strategy reclaims space much faster by consolidating several sparse volumes simultaneously.
  • L3 Compaction (Extreme Sparsity): Targets the sparsest volumes with only a small fraction of live data. While not fully detailed in the provided text, it's implied to be even more aggressive than L2, handling the extreme tail of the distribution where L2's efficiency diminishes.
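The L1 packing step described above can be sketched as a simple greedy procedure: pick the most-filled volume as the host and top it up with live bytes from the sparsest donors. This is our own minimal sketch under assumed data structures (a dict of live-byte counts and a fixed capacity), not Dropbox's implementation:

```python
# Minimal sketch of L1-style steady-state compaction (assumptions: volumes
# are modeled as name -> live bytes; capacity is fixed and hypothetical).

VOLUME_CAPACITY = 100

def l1_compact(volumes):
    """Pack live bytes from sparse donor volumes into the fullest host.

    Returns (contents of the new dense volume, names of retired donors).
    """
    ordered = sorted(volumes.items(), key=lambda kv: kv[1], reverse=True)
    host_name, host_live = ordered[0]          # fullest volume hosts
    packed = {host_name: host_live}
    retired = []
    free = VOLUME_CAPACITY - host_live
    for name, live in reversed(ordered[1:]):   # sparsest donors first
        if live <= free:
            packed[name] = live
            free -= live
            retired.append(name)               # donor can be reclaimed
    return packed, retired

volumes = {"v1": 80, "v2": 10, "v3": 35, "v4": 5}
new_vol, retired = l1_compact(volumes)
print(new_vol, retired)  # v1 hosts v4 and v2; v3 is too large to fit
```

Note the weakness the article calls out: if every volume is sparse, a single host absorbs only a couple of donors per run, so reclamation is slow; L2's multi-volume grouping addresses exactly that case.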

Key Design Considerations and Trade-offs

The implementation of L2 compaction highlights several system design trade-offs. The dynamic programming approach for grouping volumes must remain performant at scale. This was achieved by capping the number of source volumes per run and coarsening byte counts (granularity) to shrink the search space, thereby bounding compute and memory usage. These tunings allowed L2 to operate efficiently in production, significantly reducing compaction overhead and bringing storage usage back to sustainable levels.
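Both bounding knobs can be seen in a subset-sum-style sketch of the grouping step. The capacity, granularity, and source cap below are assumed values for illustration; note that rounding coarse sizes *up* guarantees a selected group never overflows the destination:

```python
# Sketch of L2-style grouping: a bounded subset-sum DP that picks sparse
# volumes whose combined live bytes nearly fill one destination volume.
# TARGET, GRANULARITY, and MAX_SOURCES are illustrative assumptions.
import math

TARGET = 1_000     # hypothetical destination volume capacity
GRANULARITY = 50   # coarsen byte counts to shrink the DP state space
MAX_SOURCES = 4    # cap on source volumes combined per run

def best_group(live_bytes):
    """Return (indices of chosen volumes, coarse upper bound on their fill)."""
    target = TARGET // GRANULARITY
    # Round up so the coarse sum over-approximates real bytes: a group that
    # fits in coarse units is guaranteed to fit in real bytes.
    sizes = [math.ceil(b / GRANULARITY) for b in live_bytes]
    # dp maps (coarse fill, volume count) -> indices achieving that state.
    dp = {(0, 0): []}
    for i, s in enumerate(sizes):
        for (filled, count), chosen in list(dp.items()):
            nf, nc = filled + s, count + 1
            if nf <= target and nc <= MAX_SOURCES and (nf, nc) not in dp:
                dp[(nf, nc)] = chosen + [i]
    (filled, _), group = max(dp.items(), key=lambda kv: kv[0][0])
    return group, filled * GRANULARITY

vols = [220, 180, 470, 90, 140, 310]
group, coarse_fill = best_group(vols)
print(group, coarse_fill, sum(vols[i] for i in group))
```

The state space is at most `(TARGET // GRANULARITY) * MAX_SOURCES` entries regardless of fleet size, which is the point: coarser granularity and a lower source cap trade packing quality for bounded compute and memory.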

💡

Dynamic Programming for Resource Optimization

L2's use of dynamic programming for a "bounded packing problem" is a powerful technique. When designing resource allocation or optimization components, consider if similar combinatorial optimization algorithms, appropriately constrained (e.g., time, memory), can provide better resource utilization and performance. Trade-offs between packing quality and computational cost are crucial for production systems.

Tags: blob storage, immutable data, storage efficiency, compaction, fragmentation, exabyte scale, erasure coding, dynamic programming
