Database sharding strategies: hash-based vs range-based in practice


Our main PostgreSQL database is starting to hit limits, particularly on write throughput, which sometimes peaks at 50k writes per second. We're building a B2B SaaS product, and a few extremely large tenants contribute a disproportionate share of the load, which makes simple horizontal scaling difficult.

We're looking into sharding, and the main debate is hash-based vs range-based. Hash-based seems good for even distribution but makes querying a specific range of tenant IDs harder, since consecutive IDs end up scattered across shards. Range-based makes it easier to split or move specific ranges but can lead to hot spots if not managed well, especially with our skewed tenant sizes.

Has anyone had practical experience sharding a PostgreSQL database with highly skewed tenant data? Did you go with a hybrid approach, maybe dedicating specific shards to large tenants and using hash-based sharding for the rest? What were the operational complexities?
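To make the hybrid idea concrete, here is a minimal sketch of the routing I have in mind: a small directory that pins known large tenants to dedicated shards, with a stable hash as the fallback for everyone else. All the names here (`DEDICATED_SHARDS`, `HASH_SHARDS`, `shard_for_tenant`, the tenant IDs and shard labels) are made up for illustration, and this only picks a shard name; it doesn't touch connections or PostgreSQL itself.

```python
import hashlib

# Hypothetical directory: large tenants pinned to their own shards.
DEDICATED_SHARDS = {
    "tenant-big-1": "shard-dedicated-1",
    "tenant-big-2": "shard-dedicated-2",
}

# Hypothetical pool of shared shards for all remaining tenants.
HASH_SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for_tenant(tenant_id: str) -> str:
    """Return a shard name: directory lookup first, stable hash otherwise."""
    dedicated = DEDICATED_SHARDS.get(tenant_id)
    if dedicated is not None:
        return dedicated
    # Use a stable digest rather than Python's built-in hash(), which is
    # randomized per process for strings and would route inconsistently.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(HASH_SHARDS)
    return HASH_SHARDS[bucket]


if __name__ == "__main__":
    for tenant in ("tenant-big-1", "tenant-small-42"):
        print(tenant, "->", shard_for_tenant(tenant))
```

One thing I already know is a weakness of this sketch: plain modulo hashing remaps most tenants whenever the number of shared shards changes, so a real setup would probably hash into a fixed number of logical buckets and map buckets to shards instead.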

8 comments


Comments