This article details Shopee's architectural solutions for a flash sale engine, focusing on critical system design challenges like preventing overselling and managing 'hot keys' during massive traffic spikes. It explains how multi-level caching, atomic operations with Redis Lua scripts, and inventory sharding are employed to ensure high availability, data consistency, and performance under extreme load.
Read original on Dev.to #architectureFlash sales represent an extreme stress test for any system, where millions of users might simultaneously attempt to purchase a highly discounted item, creating a 'hot key' problem. Directly hitting a traditional relational database (like MySQL) with such a load would lead to immediate crashes due to row locks and deadlocks. The core challenge is to manage this immense demand without overselling or system failure.
While a distributed cache like Redis is crucial, a single Redis node can still be overwhelmed by requests for a 'hot key' due to network bandwidth and CPU limits (typically ~100k Ops/sec). To mitigate this, a multi-level caching strategy is essential:
Deducting inventory involves a critical race condition if not handled atomically (Read stock -> Check if > 0 -> Write new stock). Two concurrent requests could read the same stock value of 1, both decrement it, leading to overselling. Shopee's solution is to encapsulate the inventory deduction logic within Lua scripts executed directly on Redis. Since Redis is single-threaded, a Lua script executes as an atomic transaction, preventing any interruptions and guaranteeing data consistency.
local stock_key = KEYS[1]
local stock = tonumber(redis.call('GET', stock_key))
if stock and stock > 0 then
redis.call('DECR', stock_key)
return 1 -- Purchase Successful
else
return 0 -- Out of Stock
endEven with atomic operations, a single 'hot key' on a single Redis node can still be a bottleneck for extremely popular items. To address this, Shopee implements Inventory Sharding. Instead of storing the total stock (e.g., 1,000 iPhones) in one key, they divide it into multiple shards (e.g., 10 keys, each holding 100 items: `iphone_stock_1` to `iphone_stock_10`). These shards are then distributed across different physical Redis nodes. A load balancer or router randomly directs user requests to one of these sharded keys, effectively distributing the load and preventing any single node from being overwhelmed.