Sharding as a Scalability Pattern
Horizontal data partitioning for write scalability: shard key strategies, range vs hash sharding, cross-shard queries, and rebalancing.
Why Sharding?
Sharding (also called horizontal partitioning) splits a single large dataset across multiple independent database nodes, called shards. Each shard holds a subset of the data and handles the reads and writes for its partition. Sharding is the primary technique for scaling writes beyond what a single machine can handle — vertical scaling (bigger machines) eventually hits a ceiling, but you can always add more shards.
MongoDB, Cassandra, DynamoDB, and Vitess (used by YouTube and Slack) all implement sharding under the hood. Understanding sharding is a core expectation in senior system design interviews.
The Shard Key
The shard key is the attribute used to route each record to a specific shard. Choosing the wrong shard key is the most common sharding mistake. A good shard key distributes data evenly across shards (avoids hotspots), enables most queries to target a single shard (avoids scatter-gather), and is immutable (never changes once assigned to a record).
Hot Shard Anti-Pattern
Using a low-cardinality or temporally skewed shard key creates hot shards. For example, sharding by `status` (active/inactive) concentrates all writes on the `active` shard. Sharding by `created_at` date means the current-day shard gets all new writes while all other shards sit idle. Always measure your access distribution before choosing a shard key.
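The skew is easy to see in a quick simulation. This sketch uses a stand-in modulo router (plain Python, illustrative only) and counts how many keys each shard would receive:

```python
from collections import Counter

def shard_for(key: str, num_shards: int) -> int:
    # Stand-in router: simple modulo hashing over the shard key.
    return hash(key) % num_shards

def skew_report(keys, num_shards: int = 4) -> dict:
    # Count how many keys land on each shard to reveal hotspots.
    return dict(Counter(shard_for(k, num_shards) for k in keys))

# A low-cardinality key like `status` funnels nearly all writes to one shard:
statuses = ["active"] * 98 + ["inactive"] * 2
print(skew_report(statuses))  # one shard receives ~98% of the keys
```

Running the same report against a high-cardinality key like `user_id` would show counts spread roughly evenly across all four shards.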
Range Sharding vs Hash Sharding
| Aspect | Range Sharding | Hash Sharding |
|---|---|---|
| How it works | Records with keys in a range `[A-M]` go to shard 1, `[N-Z]` to shard 2 | `hash(key) % num_shards` determines the target shard |
| Range queries | Efficient — data is sorted, single shard or contiguous shards | Scatter-gather — range spans all shards unpredictably |
| Write distribution | Can be skewed if new keys are always at the high end (e.g., time-series) | Uniform distribution regardless of key pattern |
| Rebalancing | Split ranges without moving all data | Adding a shard requires remapping and moving many keys |
| Best for | Time-series, ordered lookups, prefix scans | Key-value access, user IDs, random writes |
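Both routing strategies from the table fit in a few lines. In this sketch the boundary letters, shard count, and use of md5 are arbitrary illustrative choices:

```python
import bisect
import hashlib

# Range sharding: sorted upper-bound letters define each shard's key range.
RANGE_BOUNDS = ["g", "n", "t"]  # shard 0: a-f, 1: g-m, 2: n-s, 3: t-z

def range_shard(key: str) -> int:
    # A key routes to the shard whose range contains its first letter.
    return bisect.bisect_right(RANGE_BOUNDS, key[0].lower())

def hash_shard(key: str, num_shards: int = 4) -> int:
    # Stable hash (md5) so routing does not depend on the process hash seed.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

print(range_shard("alice"), range_shard("zoe"))  # 0 3: adjacent keys stay together
```

Note the trade-off in code form: `range_shard` keeps lexically close keys on the same shard (good for prefix scans, bad for skewed inserts), while `hash_shard` scatters them uniformly.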
Consistent Hashing for Sharding
Simple modulo hashing (`hash(key) % N`) breaks badly when you add or remove shards: almost all keys map to different shards, forcing a massive data migration. Consistent hashing places both keys and shards on a virtual ring. Adding a shard only displaces keys from its immediate neighbor, limiting migration to roughly `1/N` of the data. Cassandra and Amazon's Dynamo use consistent hashing with virtual nodes (vnodes) for smoother rebalancing; Redis Cluster achieves a similar effect with a fixed set of 16,384 hash slots mapped to nodes.
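A minimal ring can be sketched as follows; md5 and 100 vnodes per shard are arbitrary choices for illustration, not a production recommendation:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes (a sketch)."""

    def __init__(self, vnodes: int = 100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) points on the ring

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node: str):
        # Each physical node owns many small arcs (vnodes) on the ring,
        # which smooths the load when nodes are added or removed.
        for i in range(self.vnodes):
            self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    def get_node(self, key: str) -> str:
        # Walk clockwise from the key's position to the next vnode.
        h = self._hash(key)
        idx = bisect.bisect_right(self.ring, (h, "")) % len(self.ring)
        return self.ring[idx][1]
```

Adding a fourth node to a three-node ring relocates roughly a quarter of the keys, and every relocated key moves to the new node; with modulo hashing, by contrast, roughly three quarters of keys would move, many between old nodes.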
Cross-Shard Queries
Sharding's biggest operational challenge is cross-shard queries — queries that cannot be answered by a single shard. Aggregations (`COUNT`, `SUM`), range scans across the shard key boundary, and `JOIN`s between records on different shards all require a scatter-gather approach: fan out the query to all shards in parallel, collect results, merge and sort in the application layer. This is slow and complex.
- Avoid scatter-gather: Design queries so the shard key is always in the filter. Prefer access patterns like 'all orders for user X' (route to user X's shard) over 'all orders placed in December' (hits all shards).
- Secondary index: Maintain a separate lookup table mapping secondary keys to shard IDs (the Index Table pattern, covered in the next lesson).
- Global aggregates: Pre-compute aggregates with background jobs and store in a dedicated analytics store (e.g., ClickHouse, Redshift).
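The scatter-gather fan-out itself can be sketched like this; `query_shard` here filters an in-memory list standing in for a real per-shard database connection:

```python
from concurrent.futures import ThreadPoolExecutor

def query_shard(shard_rows, predicate):
    # Stand-in for a per-shard query; real code would hit a DB connection.
    return [row for row in shard_rows if predicate(row)]

def scatter_gather(shards, predicate, sort_key):
    # Fan the query out to every shard in parallel, then merge and sort
    # the partial results in the application layer.
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda rows: query_shard(rows, predicate), shards)
    return sorted((row for part in partials for row in part), key=sort_key)

shards = [
    [{"user": 1, "total": 30}, {"user": 4, "total": 10}],  # shard 0
    [{"user": 2, "total": 50}],                            # shard 1
]
rows = scatter_gather(shards, lambda r: r["total"] > 5, lambda r: r["total"])
print([r["total"] for r in rows])  # [10, 30, 50]
```

The merge-and-sort step is exactly the work a single-node database would have done internally, which is why scatter-gather latency is bounded by the slowest shard.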
Resharding and Rebalancing
As data grows, shards fill up and must be split. Resharding is operationally expensive: data must be migrated, routing tables updated, and zero downtime is hard to guarantee. Planning ahead is critical:
- Over-shard initially: create more logical shards (e.g., 1024) than physical nodes. Map multiple logical shards to each physical node. Adding capacity only requires moving logical shards to new nodes, not re-hashing data.
- Use a directory service: store shard-to-node mappings in a consistent store (ZooKeeper, etcd). Updating mappings is instant; data moves in the background.
- Double-write during migration: route writes to both old and new shard, verify consistency, then cut over reads.
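The over-sharding and directory ideas combine naturally. In this sketch a plain dict stands in for the directory service, and 1,024 logical shards across four nodes are arbitrary illustrative numbers:

```python
import hashlib

NUM_LOGICAL_SHARDS = 1024

# Directory: logical shard -> physical node. In production this mapping
# lives in a consistent store like ZooKeeper or etcd; a dict stands in here.
directory = {ls: f"node-{ls % 4}" for ls in range(NUM_LOGICAL_SHARDS)}

def logical_shard(key: str) -> int:
    # A key's logical shard never changes, even as nodes come and go.
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % NUM_LOGICAL_SHARDS

def node_for(key: str) -> str:
    return directory[logical_shard(key)]

# Adding capacity: reassign a subset of logical shards to a new node.
# No keys are re-hashed; only the reassigned shards' data moves.
for ls in range(0, NUM_LOGICAL_SHARDS, 8):
    directory[ls] = "node-4"
```

Because routing is a two-step lookup (key to logical shard, logical shard to node), capacity changes touch only the second step, which is a metadata update rather than a data re-hash.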
Interview Tip
In an interview, after proposing sharding, always address three follow-ups proactively: (1) 'How did you choose the shard key?' — justify with access pattern analysis; (2) 'How do you handle cross-shard queries?' — mention scatter-gather and propose alternatives; (3) 'How do you rebalance?' — mention logical/virtual shards or consistent hashing. Showing you know the pain points of sharding demonstrates real production experience.
Real-World Sharding Examples
| Company | System | Shard Key Strategy |
|---|---|---|
| Instagram | User photos in PostgreSQL | Shard by user ID using a logical shard map |
| Uber | Trip database | Shard by city ID for geographic locality |
| Discord | Messages in Cassandra | Partition by (channel_id, bucket) — time-bucketed |
| DynamoDB | All tables | Hash partition key + optional sort key for range |