This article details Netflix's architectural shift to a hierarchical 'Slow-Fast' personalized notification system. It addresses the challenge of optimizing for immediate user engagement versus long-term member satisfaction by decoupling strategic messaging plans from real-time message selection. The system leverages a feature store for asynchronous communication between the slow planning policy and the fast execution policy, enabling independent evolution and consistent user experiences.
Read original on Netflix Tech Blog
Netflix's personalized notification system faces a core dilemma: maximizing immediate engagement (e.g., clicks on a notification) can conflict with a member's long-term satisfaction (e.g., preventing notification fatigue and opt-outs). Their previous single-policy system, which used a causal model to predict the effect of a single message over a short horizon, suffered from two main limitations:
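Inspired by Daniel Kahneman's "Thinking, Fast and Slow," Netflix decoupled its notification engine into a two-layer system: a Slow Policy that plans each member's messaging strategy, and a Fast Policy that selects messages in real time.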
The key to decoupling these policies is asynchronous communication facilitated by a low-latency feature store. The Slow Policy calculates a member's ideal pacing plan (e.g., 3 push notifications, 2 emails per week) and writes this strategic intent to the feature store. The Fast Policy, upon a notification opportunity, pulls this stored plan as a feature and executes tactical send decisions within these pre-defined strategic limits.
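The write-then-read flow described above can be sketched roughly as follows. This is a minimal illustration, not Netflix's implementation: the `InMemoryFeatureStore` class, the `PacingPlan` fields, the fatigue-based planning rule, and the score threshold are all hypothetical stand-ins chosen to show how the two policies communicate only through the stored plan.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PacingPlan:
    """Weekly send budget per channel, as written by the slow policy."""
    push_per_week: int
    email_per_week: int


class InMemoryFeatureStore:
    """Hypothetical stand-in for a low-latency feature store."""

    def __init__(self) -> None:
        self._plans: Dict[str, PacingPlan] = {}

    def write_plan(self, member_id: str, plan: PacingPlan) -> None:
        self._plans[member_id] = plan

    def read_plan(self, member_id: str) -> Optional[PacingPlan]:
        return self._plans.get(member_id)


def slow_policy_update(store: InMemoryFeatureStore, member_id: str,
                       fatigue_score: float) -> None:
    """Planning step: pick a pacing plan and persist it.

    The threshold rule is a placeholder for whatever model does the planning.
    """
    if fatigue_score > 0.5:
        plan = PacingPlan(push_per_week=1, email_per_week=1)
    else:
        # Mirrors the example in the text: 3 push notifications, 2 emails.
        plan = PacingPlan(push_per_week=3, email_per_week=2)
    store.write_plan(member_id, plan)


def fast_policy_decide(store: InMemoryFeatureStore, member_id: str,
                       channel: str, sent_this_week: int,
                       message_score: float, threshold: float = 0.6) -> bool:
    """Execution step: decide on one send opportunity within the stored plan."""
    plan = store.read_plan(member_id)
    if plan is None:
        return False  # assumption: with no plan stored, skip the send
    budgets = {"push": plan.push_per_week, "email": plan.email_per_week}
    if channel not in budgets:
        raise ValueError(f"unknown channel: {channel}")
    if sent_this_week >= budgets[channel]:
        return False  # the strategic limit always wins over the tactical score
    return message_score >= threshold
```

In this sketch the fast policy never calls the slow policy directly. It only reads whatever plan was last written, which is what lets the two run on different schedules.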
Architectural Advantages of Decoupling
This hierarchical architecture provides two main benefits. The first is 'Stickiness': because the Fast Policy honors the stored plan, members get a consistent experience. The second is 'Independent Evolution': the slow planning layer and the fast execution layer can change separately, so pacing strategies and content ranking models can be optimized and A/B tested without affecting each other.
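One common way to test two layers separately is to assign each member to each experiment with an independently salted hash, so that the pacing experiment and the ranking experiment do not line up with each other. The sketch below illustrates that general technique only. The source does not describe how Netflix assigns experiment cells, and the function name and experiment names here are hypothetical.

```python
import hashlib
from typing import Sequence


def assign_variant(member_id: str, experiment: str,
                   variants: Sequence[str]) -> str:
    """Deterministically map a member to one variant of one experiment.

    Salting the hash with the experiment name means a member's cell in one
    experiment does not determine their cell in another.
    """
    digest = hashlib.sha256(f"{experiment}:{member_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Hypothetical experiment names: one per layer.
pacing_cell = assign_variant("member-42", "slow_pacing_test", ["control", "treatment"])
ranking_cell = assign_variant("member-42", "fast_ranking_test", ["control", "treatment"])
```

Because the assignment is a pure function of the member and the experiment name, the same member lands in the same cell on every call, which fits the consistency goal of the stored plan.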