This article details Dropbox's approach to improving storage efficiency in Magic Pocket, their exabyte-scale immutable blob store. It highlights the challenges of fragmentation in immutable systems and introduces a multi-strategy compaction system (L1, L2, L3) designed to efficiently reclaim fragmented space and reduce storage overhead, especially in scenarios with severely under-filled volumes.
Immutable blob stores, like Dropbox's Magic Pocket, offer significant benefits in data durability and consistency. However, they introduce a unique challenge: when files are updated or deleted, the old data is not immediately overwritten but remains on disk until explicitly reclaimed. This leads to data fragmentation, where live data is sparsely distributed across many volumes, increasing storage overhead and costs. At exabyte scale, even minor increases in overhead translate to substantial infrastructure expenses, necessitating robust space reclamation strategies.
Erasure Coding vs. Replication
Dropbox uses erasure coding for most data, which splits data into fragments and adds parity fragments for fault tolerance. This provides similar durability to replication (storing multiple full copies) but with significantly less storage overhead. However, efficient redundancy alone isn't enough; continuous compaction is still critical to manage fragmentation.
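The overhead difference is easy to quantify. The sketch below compares raw-bytes-per-logical-byte under full replication versus a (k data + m parity) erasure code; the specific (8 + 4) layout is an illustrative assumption, not Magic Pocket's actual scheme.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per logical byte with full replication."""
    return float(copies)

def erasure_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per logical byte with (k + m) erasure coding."""
    return (data_fragments + parity_fragments) / data_fragments

# 3x replication stores 3.0 raw bytes per logical byte.
print(replication_overhead(3))   # 3.0

# A hypothetical (8 data + 4 parity) code tolerates the loss of any
# 4 fragments at only 1.5x overhead.
print(erasure_overhead(8, 4))    # 1.5
```

Both schemes survive multiple disk failures, but the erasure-coded layout halves the raw footprint, which is why compaction of the remaining overhead matters so much at exabyte scale.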
Dropbox developed a multi-strategy compaction system to address varying levels of fragmentation. The core process involves garbage collection (identifying unreferenced blobs) and compaction (physically moving live blobs from fragmented volumes to new, denser volumes, then retiring the old ones). The incident with the Live Coder service, which caused widespread under-filled volumes, exposed the limitations of the initial steady-state compaction strategy (L1) and prompted the development of the more aggressive L2 and L3 strategies.
The implementation of L2 compaction highlights several system design trade-offs. The dynamic programming approach for grouping volumes must be performant at scale. This was achieved by capping the number of source volumes per run and by coarsening byte counts (granularity) to shrink the search space, thereby bounding compute and memory usage. These tunings allowed L2 to operate efficiently in production, significantly reducing compaction overhead and bringing storage usage back to sustainable levels.
Dynamic Programming for Resource Optimization
L2's use of dynamic programming for a "bounded packing problem" is a powerful technique. When designing resource allocation or optimization components, consider if similar combinatorial optimization algorithms, appropriately constrained (e.g., time, memory), can provide better resource utilization and performance. Trade-offs between packing quality and computational cost are crucial for production systems.
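One way to picture the bounded packing problem is as a subset-sum DP over coarsened sizes: choose a subset of under-filled source volumes whose combined live bytes best fill one target volume. The granularity, source cap, and capacity values below are illustrative assumptions, not L2's actual parameters.

```python
def best_fill(live_bytes: list, capacity: int,
              granularity: int = 1 << 20, max_sources: int = 16) -> int:
    """Largest achievable fill <= capacity, combining at most
    max_sources source volumes, with sizes coarsened to `granularity`
    buckets to keep the DP state space small."""
    cap = capacity // granularity
    # Cap the number of candidate source volumes per run.
    sizes = sorted((b // granularity for b in live_bytes),
                   reverse=True)[:max_sources]
    reachable = {0}  # coarsened fill levels reachable so far
    for s in sizes:
        reachable |= {r + s for r in reachable if r + s <= cap}
    return max(reachable) * granularity
```

Both knobs trade packing quality for cost: a coarser granularity shrinks `reachable` at the price of slack in each target volume, and the source cap bounds the number of DP passes per run.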