Medium #system-design·May 29, 2026

Choosing Databases Based on Core Data Structures

This article argues that database selection should be driven by the underlying data structures and their operational characteristics rather than by marketing hype. Databases are, at their core, optimized implementations of fundamental data structures, and those structures shape performance, scalability, and suitability for different use cases.


The Essence of Databases: Data Structures

At their core, databases are sophisticated wrappers around fundamental data structures. Understanding these structures, such as B-trees, hash tables, LSM-trees, and heaps, is crucial for making informed architectural decisions. Each one offers distinct trade-offs in read and write performance and in storage efficiency, which directly affect how a database behaves under a specific workload. Consistency guarantees come largely from the transaction and replication layers built on top, but they still belong in the same evaluation.
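To make the trade-off concrete, here is a minimal Python sketch contrasting a hash-based lookup with an ordered structure. The sample `records` and the helper names are hypothetical, and a sorted list searched with `bisect` only stands in for the ordered leaf level of a B-tree; it is not how any particular database is implemented.

```python
import bisect

# Hypothetical sample data: numeric IDs mapped to names.
records = {42: "ada", 7: "grace", 19: "linus", 88: "barbara", 63: "edsger"}

def point_lookup(key):
    """Hash-table view: a dict gives average O(1) lookup by exact key."""
    return records.get(key)

# Ordered view: a sorted key list stands in for a B-tree's sorted leaves.
sorted_keys = sorted(records)

def range_query(low, high):
    """Return (key, value) pairs with low <= key <= high, in key order.

    Two binary searches find the boundaries, then only matching keys are read.
    """
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return [(k, records[k]) for k in sorted_keys[start:end]]

def range_query_hash_only(low, high):
    """Without ordering, a range query has to examine every key."""
    return sorted((k, v) for k, v in records.items() if low <= k <= high)
```

Both range functions return the same rows; the difference is that the hash-only version must touch every key, while the ordered version touches only the matching slice.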

Key Data Structures and Their Database Applications

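  • B-Trees/B+Trees: Found in traditional relational databases (PostgreSQL, MySQL's InnoDB engine). Keys are kept in sorted order, so point lookups take O(log n) and range queries read adjacent entries. A good fit for ordered data and indexing, with balanced read/write performance.
  • Hash Tables: Power key-value stores (Redis, Memcached). Ideal for fast lookups by exact key, with O(1) average time complexity for reads and writes. Not suitable for range queries, because keys are not stored in order.
  • Log-Structured Merge-Trees (LSM-trees): Used in write-optimized stores such as Cassandra and in storage engines such as RocksDB. Writes are buffered in memory, flushed to disk sequentially as sorted segments, and merged (compacted) in the background. The design accepts higher read amplification, since a read may have to check several segments, in exchange for write efficiency.
  • Heaps/Priority Queues: Rarely used as primary storage, but common inside database engines, for example when keeping only the top N rows of a sort or when merging several sorted runs. (This is the priority-queue heap, not the unordered "heap file" some databases use for table storage.)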
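The LSM-tree write path described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the design of Cassandra or RocksDB: the class name `ToyLSM`, the `memtable_limit` threshold, and the in-memory "segments" are all hypothetical, and real engines add write-ahead logs, tombstones for deletes, Bloom filters, and leveled or tiered compaction.

```python
class ToyLSM:
    """Toy LSM-tree: an in-memory memtable plus immutable sorted segments."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}                    # most recent writes
        self.segments = []                    # oldest first; sorted (key, value) lists

    def put(self, key, value):
        # Writes only touch the memtable; nothing on "disk" is updated in place.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one sorted, immutable segment.
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads check the memtable first, then segments from newest to oldest.
        # Checking several segments is the read amplification the text mentions.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            for k, v in segment:  # linear scan for clarity; real engines search
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all segments into one, letting newer values overwrite older ones.
        merged = {}
        for segment in self.segments:  # oldest first
            merged.update(segment)
        self.segments = [sorted(merged.items())] if merged else []


db = ToyLSM(memtable_limit=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    db.put(key, value)
segments_before = len(db.segments)
latest_a = db.get("a")
db.compact()
```

After the four writes there are two segments, and the older one still holds a stale value for `"a"`; the read finds the newer value first, and compaction later discards the stale one.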

System Design Implication

When designing a system, don't just pick 'SQL' or 'NoSQL'. Dive deeper: Is your workload read-heavy or write-heavy? Do you need strong consistency or eventual consistency? Are range queries critical? Your answers should guide you to a database whose underlying data structures naturally align with these requirements.
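As a rough illustration, the questions above can be written down as a small decision helper. The `Workload` type and `suggest_structure` function are hypothetical names, and the rules are a deliberately crude heuristic based only on the structures listed in this article; consistency requirements are left out because they depend mostly on replication and transaction design rather than on the storage structure.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an application's access pattern."""
    write_heavy: bool
    needs_range_queries: bool


def suggest_structure(workload: Workload) -> str:
    """Return a starting point for evaluation, not a final answer."""
    if workload.write_heavy:
        # Sequential flushes favor sustained write throughput;
        # sorted segments still allow range scans.
        return "LSM-tree"
    if workload.needs_range_queries:
        # Sorted keys make range scans cheap.
        return "B-tree/B+tree"
    # Exact-key lookups only: ordering is not needed.
    return "hash table"


session_cache = Workload(write_heavy=False, needs_range_queries=False)
reporting_db = Workload(write_heavy=False, needs_range_queries=True)
event_ingest = Workload(write_heavy=True, needs_range_queries=True)
```

A real evaluation would also weigh data size, latency targets, and operational constraints; the point is only that the answers to these questions narrow the field before any product names come up.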

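Selecting a database without considering its fundamental mechanisms can lead to significant performance bottlenecks, scalability issues, and operational overhead. For instance, pushing a high-throughput, append-only log through a B-tree-based database can cost more disk I/O and use the cache less effectively than an LSM-tree-based store. A B-tree updates index pages in place, and those pages are scattered whenever keys or secondary indexes do not arrive in order, whereas an LSM-tree batches writes in memory and flushes them sequentially.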
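A back-of-the-envelope model shows where the extra I/O can come from. The sketch below counts how many distinct pages one batch of writes touches under two layouts. It assumes keys arrive in random order (as they might through a secondary index on a non-monotonic column), and the constants `PAGE_CAPACITY` and `KEY_SPACE` are hypothetical. It ignores buffering, write-ahead logging, and the later cost of LSM compaction, so treat it as intuition rather than a benchmark.

```python
import random

PAGE_CAPACITY = 100    # records per page (hypothetical)
KEY_SPACE = 1_000_000  # size of the key range (hypothetical)


def pages_touched_in_place(keys):
    """Sorted, update-in-place layout: each key lives on the page owning its range."""
    return len({key // PAGE_CAPACITY for key in keys})


def pages_touched_append(keys):
    """Log-structured layout: records fill pages in arrival order."""
    return -(-len(keys) // PAGE_CAPACITY)  # ceiling division


rng = random.Random(0)  # fixed seed so the run is repeatable
batch = [rng.randrange(KEY_SPACE) for _ in range(1000)]

in_place = pages_touched_in_place(batch)
appended = pages_touched_append(batch)
print(f"in-place pages touched: {in_place}, appended pages written: {appended}")
```

With 1,000 random keys spread over 10,000 possible pages, the in-place layout dirties close to one page per write, while the appended layout fills exactly 10 pages.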

Making Informed Database Choices

A robust system design involves matching the application's data access patterns and consistency requirements with the most suitable database technology. This requires understanding not just the marketing features, but the core engineering principles that dictate a database's behavior under load.

Tags: data structures, database selection, SQL, NoSQL, B-tree, LSM-tree, hash table, system architecture
