This article details a practical implementation of a caching strategy for frequently accessed video categories on a platform. It focuses on using a repository pattern with a file-based cache to reduce database load and improve response times for category-related data. The implementation highlights basic cache-aside patterns and cache invalidation techniques.
The original article, published on Dev.to, presents a straightforward approach to managing and caching video categories, which are critical for navigation and content organization on video platforms. The core problem it addresses is the high query load on the database, since categories are requested on almost every page.
The chosen caching strategy is a cache-aside pattern implemented within a `CategoryRepository`. When category data is requested, the system first checks the cache. If data is found, it's returned immediately. If not, the data is fetched from the database, stored in the cache, and then returned.
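The read path described above can be sketched as follows. This is a minimal illustration, not the article's code: the in-memory cache, the `get_all` and `invalidate` method names, the `categories:all` key, and the injected `load_from_db` callable are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

Rows = List[Dict[str, object]]


class InMemoryCache:
    """Stand-in cache for illustration; the article uses a file-based one."""

    def __init__(self) -> None:
        self._store: Dict[str, Rows] = {}

    def get(self, key: str) -> Optional[Rows]:
        return self._store.get(key)

    def set(self, key: str, value: Rows) -> None:
        self._store[key] = value

    def delete(self, key: str) -> None:
        self._store.pop(key, None)


class CategoryRepository:
    CACHE_KEY = "categories:all"  # hypothetical key name

    def __init__(self, cache: InMemoryCache, load_from_db: Callable[[], Rows]) -> None:
        self._cache = cache
        # Hypothetical loader that would run the category query.
        self._load_from_db = load_from_db

    def get_all(self) -> Rows:
        cached = self._cache.get(self.CACHE_KEY)
        if cached is not None:
            return cached  # cache hit: no database work
        rows = self._load_from_db()  # cache miss: query the database
        self._cache.set(self.CACHE_KEY, rows)  # populate for later requests
        return rows

    def invalidate(self) -> None:
        # Call after categories or videos change so the next read reloads.
        self._cache.delete(self.CACHE_KEY)
```

Passing the loader in as a callable keeps the sketch independent of any particular database driver; only the first `get_all()` after an `invalidate()` reaches the database.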
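The `DataCache` class abstracts the caching mechanism behind a file-based store. It serializes data to, and unserializes it from, files within a specified cache directory. While simple enough for demonstration, this approach has limitations in a distributed environment: each application instance typically keeps its own cache files on local disk, so instances can serve different data and an invalidation on one instance does not reach the others.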
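A file-based cache of this kind might look roughly like the sketch below. The original class is not reproduced in this summary, so the method names (`get`, `set`, `delete`), the JSON serialization, the hashed file names, and the write-then-rename step are assumptions chosen for the illustration.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Optional


class DataCache:
    """Hypothetical file-based cache: one JSON file per key."""

    def __init__(self, cache_dir: str) -> None:
        self._dir = Path(cache_dir)
        self._dir.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Hash the key so any string is safe to use as a file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self._dir / (digest + ".json")

    def get(self, key: str) -> Optional[Any]:
        try:
            with self._path(key).open("r", encoding="utf-8") as fh:
                return json.load(fh)
        except (FileNotFoundError, json.JSONDecodeError):
            # A missing or corrupt file is treated as a cache miss.
            return None

    def set(self, key: str, value: Any) -> None:
        # Write to a temporary file first, then rename it into place,
        # so readers never observe a half-written cache file.
        fd, tmp_name = tempfile.mkstemp(dir=str(self._dir), suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(value, fh)
        os.replace(tmp_name, self._path(key))

    def delete(self, key: str) -> None:
        try:
            self._path(key).unlink()
        except FileNotFoundError:
            pass  # already absent: nothing to invalidate
```

Note that in this sketch a stored value of `None` is indistinguishable from a miss, which is acceptable for a category list but worth keeping in mind for other data.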
System Design Consideration
For higher scale or distributed systems, a file-based cache would be replaced by a distributed cache like Redis or Memcached. This allows multiple application instances to share the same cache, improving cache hit rates and consistency across the system.
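As a rough sketch of that swap, the adapter below exposes `get`, `set`, and `delete` on top of a Redis client. It assumes a client object with the `get(name)`, `set(name, value, ex=...)`, and `delete(name)` methods that redis-py's `redis.Redis` provides; the class name `RedisCategoryCache` and the 300-second expiry are hypothetical choices, not taken from the article.

```python
import json
from typing import Any, Optional


class RedisCategoryCache:
    """Hypothetical adapter around a Redis client shared by all instances."""

    def __init__(self, client: Any, ttl_seconds: int = 300) -> None:
        # `client` is expected to behave like redis.Redis (redis-py).
        self._client = client
        self._ttl = ttl_seconds  # assumed expiry; tune to how often data changes

    def get(self, key: str) -> Optional[Any]:
        raw = self._client.get(key)  # bytes (or str), or None when absent
        if raw is None:
            return None
        return json.loads(raw)

    def set(self, key: str, value: Any) -> None:
        # `ex` sets the expiry in seconds, so stale entries age out
        # even if an explicit invalidation is missed.
        self._client.set(key, json.dumps(value), ex=self._ttl)

    def delete(self, key: str) -> None:
        self._client.delete(key)
```

With redis-py installed, the client would typically be constructed as `redis.Redis(host=..., port=...)` and passed in. Because every application instance talks to the same server, one `delete` invalidates the entry for all of them.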
```sql
SELECT c.id, c.name, c.slug, c.icon_emoji, COUNT(v.id) AS video_count
FROM categories c
LEFT JOIN videos v ON v.category_id = c.id AND v.is_active = 1
GROUP BY c.id
HAVING video_count > 0
ORDER BY video_count DESC
```

The SQL query itself includes a `LEFT JOIN` with the `videos` table to count active videos per category and filters out categories with zero active videos. This pre-aggregation helps ensure that the cached category data is complete and accurate, reducing subsequent application-level processing.
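To see what the query returns, it can be run against a small in-memory SQLite database, as in the sketch below. The database engine used by the article is not stated here; SQLite is used only because it ships with Python and, like MySQL, accepts the `video_count` alias in `HAVING` (PostgreSQL, for example, does not). The table definitions and sample rows are invented for the demonstration.

```python
import sqlite3
from typing import Dict, List

CATEGORY_QUERY = """
SELECT c.id, c.name, c.slug, c.icon_emoji, COUNT(v.id) AS video_count
FROM categories c
LEFT JOIN videos v ON v.category_id = c.id AND v.is_active = 1
GROUP BY c.id
HAVING video_count > 0
ORDER BY video_count DESC
"""


def build_demo_db() -> sqlite3.Connection:
    """Create a throwaway database with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript(
        """
        CREATE TABLE categories (
            id INTEGER PRIMARY KEY, name TEXT, slug TEXT, icon_emoji TEXT
        );
        CREATE TABLE videos (
            id INTEGER PRIMARY KEY, category_id INTEGER, is_active INTEGER
        );
        INSERT INTO categories VALUES
            (1, 'Music', 'music', '🎵'),
            (2, 'Gaming', 'gaming', '🎮'),
            (3, 'News', 'news', '📰'),
            (4, 'Empty', 'empty', '📁');
        INSERT INTO videos VALUES
            (1, 1, 1), (2, 1, 1),  -- Music: two active videos
            (3, 2, 1), (4, 2, 0),  -- Gaming: one active, one inactive
            (5, 3, 0);             -- News: inactive only
        """
    )
    return conn


def fetch_categories(conn: sqlite3.Connection) -> List[Dict[str, object]]:
    # Convert rows to plain dicts so they can be serialized into a cache.
    return [dict(row) for row in conn.execute(CATEGORY_QUERY)]
```

With this sample data, `fetch_categories(build_demo_db())` yields Music (2 videos) followed by Gaming (1 video); News and Empty are dropped because neither has an active video.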