Dev.to #systemdesign·June 13, 2026

Designing a Scalable Notification Fan-Out Service

This article explores the architectural considerations for building a notification service capable of fanning out messages (push, email, SMS) at massive scale, addressing challenges like high fan-out factors, diverse external API requirements, retries, and exactly-once delivery. It details a pipeline approach with queues to decouple stages and discusses the trade-offs between fan-out on write and fan-out on read strategies.


Core Challenges in Notification Fan-Out

Designing a notification service, especially one handling millions of recipients, presents several non-trivial challenges. These include managing a highly variable fan-out factor (anywhere from a single message to 2 million messages per event), integrating with multiple external providers (each with its own rate limits, error model, and protocol), ensuring reliable delivery through retries, and preventing duplicate notifications when those retries happen (exactly-once semantics).
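One common way to reconcile retries with duplicate prevention is to deduplicate on an idempotency key. The sketch below is illustrative only and is not taken from the article: `idempotency_key`, `DedupStore`, and `deliver_once` are hypothetical names, and the in-memory set stands in for whatever durable store with an atomic set-if-absent operation a real deployment would use.

```python
import hashlib
from typing import Callable


def idempotency_key(event_id: str, recipient_id: str, channel: str) -> str:
    """Derive a stable key: the same logical notification always maps to the same value."""
    raw = f"{event_id}:{recipient_id}:{channel}".encode()
    return hashlib.sha256(raw).hexdigest()


class DedupStore:
    """In-memory stand-in (assumption) for a shared, durable deduplication store."""

    def __init__(self) -> None:
        self._seen = set()

    def claim(self, key: str) -> bool:
        """Return True only the first time a key is claimed."""
        if key in self._seen:
            return False
        self._seen.add(key)
        return True

    def release(self, key: str) -> None:
        """Forget a key so that a failed send can be retried later."""
        self._seen.discard(key)


def deliver_once(
    store: DedupStore,
    event_id: str,
    recipient_id: str,
    channel: str,
    send: Callable[[str, str], None],
) -> bool:
    """Send unless this (event, recipient, channel) was already delivered."""
    key = idempotency_key(event_id, recipient_id, channel)
    if not store.claim(key):
        return False  # duplicate: a retry or redelivery of a message already sent
    try:
        send(recipient_id, channel)
    except Exception:
        store.release(key)  # the send failed, so allow a later retry
        raise
    return True
```

Under these assumptions a redelivered queue message is silently skipped, while a failed send releases its key so the retry path can try again. This only approximates exactly-once behaviour: a crash between the provider call and the key being recorded can still produce a duplicate or a gap.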

Fan-Out Strategies: Write vs. Read

A fundamental architectural decision is choosing between fan-out on write and fan-out on read. Each approach has distinct trade-offs:

  • Fan-out on Write: The event is immediately expanded into individual messages for each recipient and enqueued. This offers instant delivery and simpler readers but can lead to write storms for high-degree events and high storage costs for message copies.
  • Fan-out on Read: The event is stored once, and each recipient's device pulls the notification when it connects. This is cheaper to ingest and handles celebrity accounts gracefully, but delivery is not instant and readers do more work.
  • Hybrid Approach: Most production systems use a hybrid model. Low-follower accounts use fan-out on write for speed, while high-follower accounts use a deferred write-fan-out or fan-out on read to mitigate spikes and resource starvation.
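The hybrid model can be sketched as a follower-count check at publish time. This is a minimal illustration, not the article's implementation: the `HybridFeed` class, its in-memory dictionaries, and the `WRITE_FAN_OUT_LIMIT` threshold are all assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical cut-off; a real system would tune this from measurements.
WRITE_FAN_OUT_LIMIT = 10_000


class HybridFeed:
    """Fan out on write for small audiences, on read for large ones."""

    def __init__(self, followers, limit=WRITE_FAN_OUT_LIMIT):
        self.followers = followers        # author -> set of follower ids
        self.limit = limit
        self.inboxes = defaultdict(list)  # recipient -> events copied at write time
        self.shared = defaultdict(list)   # author -> events stored once

    def publish(self, author, event):
        """Store the event and report which strategy was used."""
        recipients = self.followers.get(author, set())
        if len(recipients) <= self.limit:
            # Fan-out on write: one copy per recipient, made immediately.
            for recipient in recipients:
                self.inboxes[recipient].append(event)
            return "write"
        # Fan-out on read: a single copy, resolved when a recipient fetches.
        self.shared[author].append(event)
        return "read"

    def fetch(self, recipient):
        """Merge pre-computed inbox entries with events pulled at read time."""
        events = list(self.inboxes.get(recipient, []))
        for author, recipients in self.followers.items():
            if recipient in recipients:
                events.extend(self.shared.get(author, []))
        return events
```

The sketch makes the trade-off visible: the write path costs one append per follower, while the read path costs one lookup per followed author. It does not order the merged events by time, which a real feed would need to do.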

Notification Pipeline Architecture

The article proposes a decoupled, queue-based pipeline to handle the varying speeds of different stages and external dependencies. This modular design helps prevent slow components from blocking faster ones.

event -> [ingest] -> [fan-out] -> [per-channel queues] -> [channel workers] -> providers
  1. Ingest: Accepts, validates, and durably stores the event, then acknowledges the producer. It's designed to be fast, without message expansion.
  2. Fan-out Worker: Resolves recipients, applies user preferences (e.g., muting channels), and publishes one message per (recipient, channel) to the appropriate channel queue. This is where preferences are honored to avoid unnecessary processing.
  3. Channel Queues: Dedicated queues for each channel (push, email, SMS) ensure that an outage or slowdown in one provider doesn't impact other notification types.
  4. Channel Workers: Pull messages from their respective queues, call the external provider via an adapter, and manage response handling, retries, and dead-lettering.
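The fan-out and channel-worker stages above can be sketched with one in-process queue per channel. This is a simplified illustration under stated assumptions: the names `Message`, `fan_out`, and `drain` are hypothetical, `queue.Queue` stands in for a durable message broker, and the retry loop omits the backoff delay a production worker would add between attempts.

```python
import queue
from dataclasses import dataclass

CHANNELS = ("push", "email", "sms")


@dataclass(frozen=True)
class Message:
    event_id: str
    recipient: str
    channel: str


def fan_out(event_id, recipients, muted, queues):
    """Publish one message per (recipient, channel), skipping muted channels."""
    published = 0
    for recipient in recipients:
        for channel in CHANNELS:
            if channel in muted.get(recipient, set()):
                continue  # honor preferences before any work is enqueued
            queues[channel].put(Message(event_id, recipient, channel))
            published += 1
    return published


def drain(channel_queue, send, dead_letters, max_attempts=3):
    """Channel worker: retry each message, dead-letter it after max_attempts."""
    delivered = 0
    while True:
        try:
            msg = channel_queue.get_nowait()
        except queue.Empty:
            return delivered
        for attempt in range(1, max_attempts + 1):
            try:
                send(msg)
                delivered += 1
                break
            except Exception:
                if attempt == max_attempts:
                    dead_letters.append(msg)  # give up and park the message
```

Because each channel has its own queue, a provider that fails every call only fills its own dead-letter list; messages waiting on the other channels are untouched.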

Adapter Pattern for External Providers

Implementing a `ChannelAdapter` interface for each external provider (e.g., EmailAdapter, SMSAdapter) abstracts away provider-specific logic, allowing channel workers to interact with a unified interface and simplifying provider switching or adding new channels.
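A minimal sketch of such an interface follows. Only the names `ChannelAdapter`, `EmailAdapter`, and `SMSAdapter` come from the text; the `send` signature, the `outbox` lists, the `MAX_LEN` limit, and the `dispatch` helper are assumptions made for illustration, and the adapters record messages locally instead of calling a real provider.

```python
from typing import Protocol


class ChannelAdapter(Protocol):
    """Unified interface that channel workers depend on."""

    channel: str

    def send(self, recipient: str, body: str) -> bool: ...


class EmailAdapter:
    channel = "email"

    def __init__(self) -> None:
        self.outbox = []

    def send(self, recipient: str, body: str) -> bool:
        # A real adapter would call the email provider's API here.
        self.outbox.append((recipient, body))
        return True


class SMSAdapter:
    channel = "sms"
    MAX_LEN = 160  # illustrative limit, not a provider guarantee

    def __init__(self) -> None:
        self.outbox = []

    def send(self, recipient: str, body: str) -> bool:
        # Provider-specific rules (here, truncation) stay inside the adapter.
        self.outbox.append((recipient, body[: self.MAX_LEN]))
        return True


def dispatch(adapters, channel: str, recipient: str, body: str) -> bool:
    """Worker-side call site: it only knows the ChannelAdapter interface."""
    adapter = adapters.get(channel)
    if adapter is None:
        return False  # no adapter registered for this channel
    return adapter.send(recipient, body)
```

With this shape, adding a channel or swapping a provider means registering a different adapter object; the `dispatch` call site does not change.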

notifications, fan-out, queues, messaging, scalability, system design interview, microservices architecture, asynchronous processing
