This article discusses common caching pitfalls, such as traffic spikes during cache invalidation and the thundering herd problem, which can paradoxically increase load on backend systems. It explores strategies to prevent these issues, focusing on techniques like pre-warming, probabilistic caching, and using message queues to orchestrate cache updates and reduce concurrent requests.
While caching is a fundamental technique for performance optimization and load reduction in distributed systems, improper implementation can lead to significant problems. Two major issues are traffic spikes during cache invalidation and the 'thundering herd' problem, where many clients simultaneously request data not found in the cache, overwhelming the backend.
The Thundering Herd Problem
This occurs when a cache item expires or is invalidated, and numerous concurrent requests for that item bypass the cache, simultaneously hitting the backend data source. This can lead to a sudden, massive increase in backend load, potentially causing system degradation or outages. The problem is exacerbated in highly distributed systems with many clients.
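A common mitigation is request coalescing (sometimes called "single-flight"): when an entry is missing, only one caller recomputes it while concurrent callers for the same key wait and reuse that result. The sketch below is a minimal in-process illustration with hypothetical names (`CoalescingCache`, `loader`), not a production implementation.

```python
import threading
import time

class CoalescingCache:
    """Toy in-process cache: on a miss, only one thread calls the
    backend loader; concurrent callers for the same key block until
    that single call finishes, then reuse its result."""

    def __init__(self, ttl_seconds=60):
        self._ttl = ttl_seconds
        self._store = {}              # key -> (value, expires_at)
        self._locks = {}              # key -> lock guarding the backend call
        self._guard = threading.Lock()

    def get(self, key, loader):
        entry = self._store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                       # fresh hit, no lock needed

        with self._guard:                         # one lock object per key
            lock = self._locks.setdefault(key, threading.Lock())

        with lock:                                # only one filler at a time
            entry = self._store.get(key)          # re-check: another thread
            if entry and entry[1] > time.time():  # may have filled it already
                return entry[0]
            value = loader(key)                   # the single backend call
            self._store[key] = (value, time.time() + self._ttl)
            return value
```

In a distributed deployment the same idea needs a shared lock (e.g. a lease in Redis or a coordination service) rather than a per-process `threading.Lock`, which is where the extra infrastructure mentioned below comes in.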
Implementing mitigation strategies such as pre-warming, probabilistic early refresh, and queue-mediated cache updates requires careful consideration of trade-offs, including consistency guarantees, additional infrastructure (such as message queues or distributed locks), and added latency for the request that fills the cache. The right choice depends on the specific use case, data volatility, and acceptable staleness.
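As one concrete example of the probabilistic approach, each reader can independently decide to refresh an entry slightly before its expiry, with a probability that rises as expiry approaches. This spreads refreshes over time instead of concentrating them at the expiry instant. The sketch below follows the widely used exponential-decay formulation (often called "XFetch"); the function name and parameters are illustrative.

```python
import math
import random
import time

def should_recompute_early(expires_at, compute_time, beta=1.0, now=None):
    """Probabilistic early recomputation: return True when this reader
    should refresh the entry now, before other readers pile up at expiry.

    expires_at   -- absolute expiry timestamp of the cached entry
    compute_time -- typical backend recomputation time, in seconds
    beta         -- tuning knob: >1 refreshes earlier, <1 later
    """
    now = time.time() if now is None else now
    # math.log(random.random()) is negative, so the subtraction pushes
    # the effective "current time" forward by a random amount scaled by
    # how expensive recomputation is; recompute once it passes expiry.
    return now - compute_time * beta * math.log(random.random()) >= expires_at
```

Callers that get `True` recompute and rewrite the entry; everyone else keeps serving the still-valid cached value, so at most a handful of requests ever hit the backend for one key.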