This article outlines seven critical architectural patterns to scale Node.js WebSocket applications from basic local setups to handling over a million concurrent real-time connections. It focuses on addressing common scalability bottlenecks such as inter-instance communication, connection state management, load balancing, and efficient resource utilization, providing practical solutions with code examples.
Achieving high concurrency with WebSockets, especially in Node.js environments, requires careful architectural considerations beyond simple single-instance deployments. The article details common pitfalls that cause WebSocket applications to fail under load and offers production-ready patterns to overcome these challenges, enabling real-time applications to scale effectively.
When scaling a WebSocket server horizontally across multiple instances behind a load balancer, direct in-memory broadcasting no longer works. Messages sent to one instance won't reach clients connected to other instances. The solution is to use a distributed messaging system like Redis Pub/Sub. Each WebSocket instance subscribes to a shared channel, and when a message is received, it's published to Redis, which then fans it out to all connected instances. Each instance then locally broadcasts the message to its own set of connected clients, ensuring global message delivery.
```typescript
import { WebSocketServer, WebSocket } from 'ws';
import { createClient } from 'redis';

const wss = new WebSocketServer({ port: 3000 });

// Separate Redis clients for publishing and subscribing: a connection
// in subscriber mode cannot issue regular commands.
const pub = createClient({ url: process.env.REDIS_URL });
const sub = createClient({ url: process.env.REDIS_URL });
await pub.connect();
await sub.connect();

const localClients = new Set<WebSocket>();

wss.on('connection', (ws) => {
  localClients.add(ws);

  // Publish incoming messages to Redis instead of broadcasting locally,
  // so every instance sees them.
  ws.on('message', async (msg) => {
    await pub.publish('chat', msg.toString());
  });

  ws.on('close', () => localClients.delete(ws));
});

// Each instance receives the fan-out and delivers it to its own clients.
await sub.subscribe('chat', (message) => {
  for (const client of localClients) {
    if (client.readyState === WebSocket.OPEN) {
      client.send(message);
    }
  }
});
```

Without sticky sessions (or session affinity), a client might reconnect to a different server instance after a temporary disconnection, losing any in-memory state. Implementing session affinity at the load balancer level ensures that a client consistently connects to the same backend server. This reduces the complexity and overhead of synchronizing session state across instances, although it can skew load distribution. IP hash and cookie-based affinity are common methods.
Nginx IP Hash Example
Using `ip_hash` in Nginx pins client connections to a specific backend server based on their IP address, maintaining session affinity for WebSocket connections.
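A minimal sketch of such a configuration is shown below; the backend hostnames (`app1`–`app3`), port, and `/ws` path are placeholders for illustration. The `Upgrade` and `Connection` headers must be forwarded explicitly, since Nginx does not pass the WebSocket handshake through by default.

```nginx
upstream websocket_backend {
    ip_hash;            # pin each client IP to one backend instance
    server app1:3000;   # placeholder hostnames
    server app2:3000;
    server app3:3000;
}

server {
    listen 80;

    location /ws {
        proxy_pass http://websocket_backend;
        proxy_http_version 1.1;                    # WebSocket requires HTTP/1.1
        proxy_set_header Upgrade $http_upgrade;    # forward the upgrade handshake
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

Note that `ip_hash` can distribute load unevenly when many clients sit behind a shared NAT or proxy, since they all present the same source IP; cookie-based affinity avoids this at the cost of extra configuration.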