Cloudflare Blog·May 13, 2026

Scaling Cloudflare Browser Run: Architecture Evolution with Containers, D1, and Queues

This article details Cloudflare's architectural evolution for its Browser Run platform, migrating from shared infrastructure to dedicated Cloudflare Containers, D1, and Queues. The rebuild addresses critical scaling, performance, and reliability challenges, particularly for spiky workloads and global distribution, enhancing capabilities for web scraping, end-to-end testing, and AI agents. Key system design decisions involved optimizing global latency, state management, and throughput for high-volume concurrent browser instances.


Cloudflare's Browser Run service, which provides programmatic control over headless browser instances, underwent a significant architectural overhaul to improve scalability, performance, and reliability. Initially sharing infrastructure with Browser Isolation (BISO), Browser Run faced limitations due to larger BISO container images, suboptimal global distribution, and conflicting usage patterns (BISO's long, steady sessions vs. Browser Run's short, spiky demand). The core of the migration involved leveraging Cloudflare's Durable Object (DO)-enabled Containers to provide dedicated, optimized infrastructure.

Addressing Global Latency and State Management

A key challenge in the new Container-based architecture was managing latency for interactive browser sessions distributed globally. While DO-enabled Containers spin up close to the request, the browser container itself might be provisioned far away, inflating WebSocket round-trip latency. Cloudflare's solution was to create regional pools of pre-warmed, DO-backed browser containers. This strategy bounds the maximum distance, and therefore the maximum latency, between the Durable Object orchestrating a session and its browser container. When a request arrives, the system selects the closest available DO-container pair within the user's region, optimizing both the user-to-DO and DO-to-container hops.
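The selection logic described above can be sketched as follows. This is a minimal illustration, not Cloudflare's actual implementation; the `PooledBrowser` type, region codes, and `pickBrowser` function are all assumed names for this example.

```typescript
// Hypothetical model of a regional pool entry (illustrative, not Cloudflare's API).
type PooledBrowser = {
  sessionId: string;
  region: string;     // e.g. "weur", "enam"
  latencyMs: number;  // measured DO-to-container round trip
  status: "available" | "picked";
};

// Pick the lowest-latency available browser in the caller's region,
// falling back to any region only if the local pool is empty.
function pickBrowser(
  pool: PooledBrowser[],
  userRegion: string,
): PooledBrowser | undefined {
  const candidates = pool.filter((b) => b.status === "available");
  const local = candidates.filter((b) => b.region === userRegion);
  const ranked = (local.length > 0 ? local : candidates).sort(
    (a, b) => a.latencyMs - b.latencyMs,
  );
  return ranked[0];
}
```

Constraining the search to the user's region first is what caps the DO-to-container distance; the latency sort then picks the best pair within that bound.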

From KV to D1 + Queues for Real-time State

The platform initially used Cloudflare Workers KV for managing container state. However, KV's eventual consistency (with a cache TTL of ~30 seconds) became a critical bottleneck for rapidly scaling to meet demand spikes, especially with the surge from AI agents. Race conditions and overallocation occurred because the cached state could be stale, leading to multiple requests attempting to claim the same 'available' browser.

⚠️

Challenge with Eventual Consistency for Critical State

Using an eventually consistent store like KV for frequently updated, real-time state where immediate consistency is crucial (e.g., resource allocation) can lead to race conditions, over-provisioning, and hinder rapid scaling. Consider transactional databases or stronger consistency models for such scenarios.
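The failure mode described above can be demonstrated with a toy model. The `StaleCacheStore` class below is purely illustrative (it is not the KV API); it simulates a cache that lags behind writes, the way a ~30-second TTL would.

```typescript
// Toy model of an eventually consistent store: writes land in the source of
// truth, but reads are served from a cache that is not invalidated immediately.
class StaleCacheStore {
  private truth = new Map<string, string>();
  private cache = new Map<string, string>(); // refreshed only on TTL expiry

  seed(key: string, value: string): void {
    this.truth.set(key, value);
    this.cache.set(key, value);
  }
  read(key: string): string | undefined {
    return this.cache.get(key) ?? this.truth.get(key);
  }
  write(key: string, value: string): void {
    this.truth.set(key, value); // cache still holds the old value
  }
}

const store = new StaleCacheStore();
store.seed("browser-1", "available");

// Worker A reads the state and claims the browser...
const seenByA = store.read("browser-1"); // "available"
store.write("browser-1", "picked");

// ...but Worker B's read is served from the stale cache, so it also
// sees "available" and claims the same browser: a double allocation.
const seenByB = store.read("browser-1");
```

Both workers observe `"available"`, which is exactly the race that pushed the platform toward a transactional store.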

To overcome this, Cloudflare migrated the container state to D1 instances, Cloudflare's SQLite-based serverless database. D1's transactional nature ensures atomic assignment of browsers, preventing race conditions. To handle the high write volume (each of several thousand containers updating its state every 5 seconds), they adopted a batching strategy using Cloudflare Queues. Containers push their state updates to regional queues, and a Worker consumer processes these in batches of 100 with a 1-second timeout, achieving significantly higher throughput (up to 500,000 container updates per location).

sql
-- Atomically claim up to ?5 browsers in a single transactional statement,
-- so no two requests can pick the same sessionId.
WITH candidate_pool AS (
  -- candidate pool logic to pick based on latency and other rules
)
UPDATE containers SET status = 'picked'
WHERE sessionId IN (
  SELECT sessionId FROM candidate_pool ORDER BY RANDOM() LIMIT ?5
)
RETURNING data;
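The consumer side of the Queues pipeline can be sketched as below. The `StateUpdate` type and `collapseBatch` function are illustrative assumptions; the batch size (100) and timeout (1 second) mirror the figures in the article, and the key idea is that the consumer deduplicates each container's updates before issuing one transactional write per batch.

```typescript
// Hypothetical shape of a container heartbeat message (illustrative names).
type StateUpdate = { sessionId: string; status: string; ts: number };

// Collapse a batch so each container contributes only its most recent update.
// A consumer Worker would then write the collapsed set to D1 in one
// transaction, turning up to 100 messages into a single database round trip.
function collapseBatch(updates: StateUpdate[]): StateUpdate[] {
  const latest = new Map<string, StateUpdate>();
  for (const u of updates) {
    const prev = latest.get(u.sessionId);
    if (!prev || u.ts > prev.ts) latest.set(u.sessionId, u);
  }
  return [...latest.values()];
}
```

Because containers report every 5 seconds, a 1-second batch window usually holds at most one update per container, but the deduplication guards against bursts and redeliveries.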

This architectural shift not only addressed the scaling and consistency issues but also enabled faster development and deployment of new features, as Browser Run gained dedicated control over its browser container images, decoupling it from BISO's release cycles and requirements.

Tags: Cloudflare, Serverless, Distributed State, Database, Queues, Containers, Low Latency, Scalability
