Dev.to #systemdesign·March 1, 2026

Scaling Node.js WebSockets for 1 Million Concurrent Connections

This article outlines seven critical architectural patterns to scale Node.js WebSocket applications from basic local setups to handling over a million concurrent real-time connections. It focuses on addressing common scalability bottlenecks such as inter-instance communication, connection state management, load balancing, and efficient resource utilization, providing practical solutions with code examples.


Achieving high concurrency with WebSockets, especially in Node.js environments, requires careful architectural considerations beyond simple single-instance deployments. The article details common pitfalls that cause WebSocket applications to fail under load and offers production-ready patterns to overcome these challenges, enabling real-time applications to scale effectively.

Inter-Instance Communication: Redis Pub/Sub

When scaling a WebSocket server horizontally across multiple instances behind a load balancer, direct in-memory broadcasting no longer works. Messages sent to one instance won't reach clients connected to other instances. The solution is to use a distributed messaging system like Redis Pub/Sub. Each WebSocket instance subscribes to a shared channel, and when a message is received, it's published to Redis, which then fans it out to all connected instances. Each instance then locally broadcasts the message to its own set of connected clients, ensuring global message delivery.

```typescript
import { WebSocketServer, WebSocket } from 'ws';
import { createClient } from 'redis';

const wss = new WebSocketServer({ port: 3000 });
const pub = createClient({ url: process.env.REDIS_URL });
const sub = createClient({ url: process.env.REDIS_URL });

await pub.connect();
await sub.connect();

// Track only the clients connected to this instance.
const localClients = new Set<WebSocket>();

wss.on('connection', (ws) => {
  localClients.add(ws);
  // Publish every incoming message to Redis so all instances receive it.
  ws.on('message', async (msg) => {
    await pub.publish('chat', msg.toString());
  });
  ws.on('close', () => localClients.delete(ws));
});

// Fan each Redis message out to this instance's locally connected clients.
await sub.subscribe('chat', (message) => {
  for (const client of localClients) {
    if (client.readyState === WebSocket.OPEN) {
      client.send(message);
    }
  }
});
```

Load Balancing Strategies: Session Affinity

Without sticky sessions (or session affinity), a client might reconnect to a different server instance after a temporary disconnection, losing any in-memory state. Implementing session affinity at the load balancer level ensures that a client consistently connects to the same backend server. This reduces the complexity and overhead of synchronizing session state across instances, although it can impact even load distribution. IP hash or cookie-based affinity are common methods.

💡

Nginx IP Hash Example

Using `ip_hash` in Nginx pins client connections to a specific backend server based on their IP address, maintaining session affinity for WebSocket connections.
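A minimal sketch of that setup; the upstream name, backend addresses, and `/ws` path are placeholders for illustration:

```nginx
upstream websocket_backend {
    ip_hash;  # pin each client IP to the same backend instance
    server 10.0.0.1:3000;
    server 10.0.0.2:3000;
}

server {
    listen 80;

    location /ws {
        proxy_pass http://websocket_backend;
        # Headers required for the WebSocket upgrade handshake
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_read_timeout 3600s;  # keep long-lived idle connections open
    }
}
```

Note that `ip_hash` breaks down when many clients sit behind one NAT or proxy IP; cookie-based affinity distributes more evenly in that case.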

Connection Management: Heartbeats and Backpressure

  • Heartbeats (Ping/Pong): Dead connections silently consume resources. Implementing periodic ping/pong messages allows servers to detect unresponsive clients and terminate stale connections, freeing up memory and file descriptors.
  • Backpressure: When a server sends data faster than a client can receive it, buffers can overflow. Proper backpressure mechanisms (e.g., checking `ws.bufferedAmount` and pausing message sending) prevent memory exhaustion on the server and improve client stability, especially in scenarios with slow consumers.
Tags: WebSockets, Node.js, Scalability, Real-time, Redis Pub/Sub, Load Balancer, Session Affinity, High Concurrency
