Dev.to #architecture·May 27, 2026

Scaling Real-Time Treasure Hunts: Solving Unbounded State with a Two-Tier Architecture

This article details Veltrix's architectural evolution to support 20,000 concurrent players in a real-time treasure hunt, focusing on overcoming unbounded state issues from long-lived WebSocket connections. The solution involved a two-tier architecture, separating ephemeral WebSocket handling from stateful processing using Rust, Kafka, and RocksDB to significantly reduce memory footprint and improve stability.


The Challenge: Unbounded State with 20k Concurrent WebSockets

Veltrix hit significant scalability problems in its real-time treasure hunt game because of unbounded state tied to 20,000 concurrent, long-lived WebSocket connections. Each connection consumed approximately 2.3 MB for TCP reassembly buffers, which quickly exhausted kernel resources (the article cites hitting `somaxconn` limits as one symptom) and led to connection failures. Initial mitigations, such as enabling `reuseport` or setting `SO_KEEPALIVE`, failed: they either left the underlying problem of persistent per-connection state untouched or broke game logic by prematurely closing active player sessions.
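
To make the scale of the problem concrete, the per-connection figures can be multiplied out. The sketch below is only back-of-the-envelope arithmetic: it uses the ~2.3 MB figure above and the ~8 KB per connection that the article reports for the redesigned shim tier, and it assumes decimal units.

```python
# Rough estimate of aggregate per-connection memory across all sockets.
# Assumption: decimal units (1 MB = 1,000 KB; 1 GB = 1,000,000 KB).
CONNECTIONS = 20_000
LONG_LIVED_KB = 2_300   # ~2.3 MB per long-lived WebSocket (from the article)
SHIM_KB = 8             # ~8 KB per connection in the ephemeral shim (from the article)


def total_gb(connections: int, per_connection_kb: int) -> float:
    """Return total memory in GB for a given per-connection cost in KB."""
    return connections * per_connection_kb / 1_000_000


if __name__ == "__main__":
    before = total_gb(CONNECTIONS, LONG_LIVED_KB)
    after = total_gb(CONNECTIONS, SHIM_KB)
    print(f"long-lived sockets: {before:.2f} GB")
    print(f"ephemeral shim:     {after:.2f} GB")
    print(f"reduction factor:   {LONG_LIVED_KB / SHIM_KB:.1f}x")
```

Under these assumptions the long-lived design needs about 46 GB of buffer memory in aggregate, against roughly 0.16 GB for the shim.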

Architectural Solution: A Two-Tiered Approach

The core architectural decision was to split the WebSocket layer into two distinct tiers to decouple real-time communication from long-lived player state:

  • Tier 1: Ephemeral WebSocket Shim (Rust with `tokio-tungstenite`): This layer is responsible solely for accepting WebSocket connections, forwarding BLE pings as CloudEvents to a Kafka topic (`ble-raw`), and then *closing the socket within 5 seconds* of the last ping. This design drastically reduced the memory footprint per connection to ~8 KB.
  • Tier 2: Dedicated Hunt Worker Pods (Kafka Streams, RocksDB): These stateful workers consume `ble-raw` via Kafka Streams in exactly-once mode. They keep player state (e.g., current beacon, scores) in a compacted RocksDB state store on NVMe with a 30-minute TTL. If a player is inactive for longer than that, the session state expires and is recreated on the player's next beacon hit. This moves long-lived state out of the connection layer and into a dedicated, fault-tolerant store.
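
The division of labor between the two tiers can be sketched in a few lines. This is an illustrative model only, not Veltrix's Rust or Kafka Streams code: a Python list stands in for the `ble-raw` topic, a dict stands in for the RocksDB store, and the CloudEvents `source` and `type` values, the class names, and the one-point-per-hit scoring rule are all hypothetical.

```python
import json
import uuid
from dataclasses import dataclass, field

IDLE_CLOSE_SECONDS = 5          # shim closes the socket this long after the last ping
SESSION_TTL_SECONDS = 30 * 60   # worker-side session TTL


def to_cloudevent(player_id: str, beacon_id: str, now: float) -> dict:
    """Wrap a BLE ping in a CloudEvents-style envelope (source/type are made up)."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": "/shim/ws",            # hypothetical
        "type": "com.example.ble.ping",  # hypothetical
        "subject": player_id,
        "data": {"beacon_id": beacon_id, "ts": now},
    }


@dataclass
class ShimConnection:
    """Tier 1: keeps only what it needs to forward pings and decide when to close."""
    player_id: str
    last_ping_at: float

    def on_ping(self, beacon_id: str, now: float, topic: list) -> None:
        self.last_ping_at = now
        topic.append(json.dumps(to_cloudevent(self.player_id, beacon_id, now)))

    def should_close(self, now: float) -> bool:
        return now - self.last_ping_at >= IDLE_CLOSE_SECONDS


@dataclass
class HuntWorker:
    """Tier 2: consumes events and owns player state (dict in place of RocksDB)."""
    sessions: dict = field(default_factory=dict)

    def consume(self, raw: str) -> None:
        event = json.loads(raw)
        player, ts = event["subject"], event["data"]["ts"]
        session = self.sessions.get(player)
        if session is None or ts - session["last_seen"] > SESSION_TTL_SECONDS:
            # Missing or expired session: recreate it on this beacon hit.
            session = {"score": 0, "beacon": None, "last_seen": ts}
        session["beacon"] = event["data"]["beacon_id"]
        session["score"] += 1  # hypothetical scoring rule
        session["last_seen"] = ts
        self.sessions[player] = session


if __name__ == "__main__":
    topic, worker = [], HuntWorker()
    conn = ShimConnection(player_id="p1", last_ping_at=0.0)
    conn.on_ping("beacon-7", 1.0, topic)
    worker.consume(topic[0])
    print(conn.should_close(6.0), worker.sessions["p1"])
```

The point of the sketch is that the connection object holds a player ID and a timestamp and nothing else, so closing the socket loses no game state; everything a player has earned lives on the worker side.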

Key System Design Takeaway

Decoupling ephemeral communication channels from long-lived application state is a powerful pattern for scaling real-time systems. This allows the communication layer to remain lightweight and highly scalable, while state management can be handled by specialized, resilient services.

RocksDB for Time-Windowed State

RocksDB was chosen for its ability to maintain time-windowed state. A compaction filter was configured to delete keys older than 30 minutes, keeping the database size manageable even with 20,000 active sessions. Because a compaction filter runs only during compaction, expired keys can remain on disk until a compaction pass reaches them. Moving from Redis's O(1) lookups to iterator-based range scans was a trade-off, but the observed throughput hit was negligible for this use case.
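
The idea of a TTL-driven compaction filter combined with ordered range scans can be modeled without RocksDB itself. The following is a minimal in-memory sketch, not the RocksDB API: the `TtlStore` class and its methods are hypothetical, a sorted key list imitates ordered storage, and `compact` plays the role of a compaction pass. Skipping expired rows at read time is an extra choice made in this sketch, not something the article describes.

```python
import bisect

TTL_SECONDS = 30 * 60  # 30-minute window from the article


class TtlStore:
    """Toy ordered key-value store with TTL-based cleanup (not the RocksDB API)."""

    def __init__(self, ttl_seconds: int = TTL_SECONDS):
        self.ttl_seconds = ttl_seconds
        self._keys = []   # kept sorted, mimicking ordered on-disk keys
        self._rows = {}   # key -> (written_at, value)

    def put(self, key: str, value, now: float) -> None:
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = (now, value)

    def _expired(self, key: str, now: float) -> bool:
        written_at, _ = self._rows[key]
        return now - written_at > self.ttl_seconds

    def scan_prefix(self, prefix: str, now: float) -> list:
        """Range scan: seek to the prefix, iterate in key order, skip expired rows."""
        start = bisect.bisect_left(self._keys, prefix)
        found = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            if not self._expired(key, now):
                found.append((key, self._rows[key][1]))
        return found

    def compact(self, now: float) -> int:
        """Drop expired rows, as a compaction filter would; return how many went."""
        dead = [k for k in self._keys if self._expired(k, now)]
        for key in dead:
            del self._rows[key]
        self._keys = [k for k in self._keys if k in self._rows]
        return len(dead)


if __name__ == "__main__":
    store = TtlStore()
    store.put("player:1:beacon", "b7", now=0)
    store.put("player:2:beacon", "b9", now=1000)
    print(store.scan_prefix("player:", now=1801))
    print(store.compact(now=1801))
```

Note that the stale row for player 1 stays in memory until `compact` runs, which mirrors why store size depends on how often compaction happens and not only on the TTL value.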

Lessons Learned and Further Refinements

A key lesson was to keep separate TTLs for different types of state. Initially, beacon state and player session TTLs were conflated within the same RocksDB key. The recommended improvement was to split the stores: one for live sessions (30-minute TTL) and another for audit logs (90 days), with the audit logs potentially kept on S3. The article suggests RocksDB's `SstFileManager` for this, but that component tracks SST files and controls their deletion rate and does not provide remote storage, so the S3 part would need a separate mechanism. Splitting the stores would avoid issues like the 4.7-second cache rebuild during worker restarts, and it reflects the broader principle: never let a real-time protocol own long-lived state that can be derived.
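
The split can be illustrated with two independent stores, each with its own TTL, written from a single beacon hit. This is a sketch under stated assumptions, not the article's implementation: dicts stand in for the two RocksDB stores, and the `Store` and `SplitState` classes and the audit key format are hypothetical.

```python
from dataclasses import dataclass, field

SESSION_TTL_SECONDS = 30 * 60            # live sessions: 30 minutes
AUDIT_TTL_SECONDS = 90 * 24 * 60 * 60    # audit logs: 90 days


@dataclass
class Store:
    """Toy key-value store with one TTL for every row it holds."""
    ttl_seconds: int
    rows: dict = field(default_factory=dict)

    def put(self, key: str, value, now: float) -> None:
        self.rows[key] = (now, value)

    def expire(self, now: float) -> int:
        """Remove rows older than this store's TTL; return how many were removed."""
        dead = [k for k, (written_at, _) in self.rows.items()
                if now - written_at > self.ttl_seconds]
        for key in dead:
            del self.rows[key]
        return len(dead)


class SplitState:
    """Routes each kind of record to the store whose TTL matches its lifetime."""

    def __init__(self):
        self.sessions = Store(SESSION_TTL_SECONDS)
        self.audit = Store(AUDIT_TTL_SECONDS)

    def record_beacon_hit(self, player_id: str, beacon_id: str, now: float) -> None:
        # Session row is overwritten per player; audit rows accumulate per event.
        self.sessions.put(player_id, {"beacon": beacon_id}, now)
        self.audit.put(f"{player_id}:{now}", {"beacon": beacon_id}, now)

    def expire(self, now: float) -> tuple:
        return self.sessions.expire(now), self.audit.expire(now)


if __name__ == "__main__":
    state = SplitState()
    state.record_beacon_hit("p1", "beacon-7", now=0)
    print(state.expire(now=31 * 60))   # session gone, audit row kept
    print(state.audit.rows)
```

Because the session store only ever holds the last 30 minutes of activity, it stays small no matter how long audit data is retained, which is what keeps restart-time rebuilds short in a design like this.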

Tags: WebSockets, Real-time, Scalability, Kafka, RocksDB, State Management, Rust, Go
