Cloudflare re-engineered the proxy mode of its Cloudflare One Client to improve performance in zero-trust environments. The original design relied on a Layer 4 to Layer 3 conversion using WireGuard and a user-space TCP stack (`smoltcp`), which introduced significant overhead. The new architecture uses QUIC for direct Layer 4 proxying, eliminating the inefficient conversion layer and taking advantage of modern transport-layer optimizations.
Read original on Cloudflare Blog
Cloudflare identified a significant performance bottleneck in the proxy mode of its Cloudflare One SASE client. The initial implementation for zero-trust environments prioritized universal compatibility by acting as a local SOCKS5 or HTTP proxy. However, the architectural choice to tunnel Layer 4 (TCP) application traffic over a Layer 3 (WireGuard) protocol, especially without kernel-level support across multiple operating systems, led to a complex and inefficient design.
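To make the "local SOCKS5 proxy" role concrete, here is a minimal sketch of the first thing such a proxy does with an application's request: extract the destination host and port. It follows the request layout in RFC 1928 and handles only the CONNECT command. The function name `parse_socks5_connect` is hypothetical, and nothing here is taken from the Cloudflare One Client's code.

```python
import ipaddress
import struct


def parse_socks5_connect(request: bytes) -> tuple[str, int]:
    """Return (host, port) from a SOCKS5 CONNECT request (RFC 1928, section 4).

    Hypothetical helper for illustration; not Cloudflare's implementation.
    """
    if len(request) < 5:
        raise ValueError("request too short")
    version, command, _reserved, address_type = request[:4]
    if version != 0x05:
        raise ValueError("not a SOCKS5 request")
    if command != 0x01:
        raise ValueError("only CONNECT is handled in this sketch")

    if address_type == 0x01:  # IPv4 address, 4 bytes
        end = 4 + 4
    elif address_type == 0x03:  # domain name, prefixed by a 1-byte length
        end = 5 + request[4]
    elif address_type == 0x04:  # IPv6 address, 16 bytes
        end = 4 + 16
    else:
        raise ValueError("unknown address type")

    # The 2-byte port follows the address, so both must be fully present.
    if len(request) < end + 2:
        raise ValueError("truncated request")

    if address_type == 0x01:
        host = str(ipaddress.IPv4Address(request[4:end]))
    elif address_type == 0x03:
        host = request[5:end].decode("ascii")
    else:
        host = str(ipaddress.IPv6Address(request[4:end]))

    (port,) = struct.unpack("!H", request[end:end + 2])  # network byte order
    return host, port
```

The result is a Layer 4 destination (a host and a port), which is why carrying the traffic over a Layer 3 tunnel requires an extra conversion step.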
The original design converted the application's TCP streams into Layer 3 packets for the WireGuard tunnel. This conversion was handled by `smoltcp`, a Rust-based user-space TCP implementation. While functional, `smoltcp` is optimized for embedded systems and lacks support for modern TCP features, which created a performance ceiling. In addition, the Cloudflare edge had to perform the inverse conversion from L3 packets back to an L4 stream, adding further latency and reducing throughput. The result was sluggish browsing, slow file transfers, and poor video call quality, particularly on media-heavy sites with numerous concurrent connections.
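The two conversions described above can be sketched in a few lines. This is only a toy model of the idea, not `smoltcp` or Cloudflare's code: it cuts a byte stream into sequence-numbered segments (L4 to L3) and reassembles them on the far side (L3 back to L4). The names `Segment`, `stream_to_segments`, and `segments_to_stream` are hypothetical, and the 1380-byte segment size is an assumed value (a 1420-byte tunnel MTU minus 40 bytes of IPv4 and TCP headers).

```python
from dataclasses import dataclass

# Assumed header cost per inner packet: IPv4 (20) + TCP (20), no options.
INNER_HEADER_BYTES = 40


@dataclass(frozen=True)
class Segment:
    seq: int        # byte offset of this payload within the stream
    payload: bytes


def stream_to_segments(stream: bytes, mss: int = 1380) -> list[Segment]:
    """L4 -> L3: split a byte stream into payloads of at most `mss` bytes."""
    return [Segment(off, stream[off:off + mss])
            for off in range(0, len(stream), mss)]


def segments_to_stream(segments: list[Segment]) -> bytes:
    """L3 -> L4: restore the byte stream, tolerating out-of-order arrival."""
    ordered = sorted(segments, key=lambda s: s.seq)
    return b"".join(s.payload for s in ordered)


def inner_header_overhead(segments: list[Segment]) -> int:
    """Header bytes added by packetizing, before any tunnel encapsulation."""
    return len(segments) * INNER_HEADER_BYTES
```

A real TCP stack also has to handle retransmission, windowing, and congestion control for every segment, which this sketch omits. That per-connection work, done in user space on the client and undone at the edge, is the cost the original design paid.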
To overcome these limitations, Cloudflare completely rebuilt the proxy mode on QUIC (originally short for Quick UDP Internet Connections). By using MASQUE (Multiplexed Application Substrate over QUIC Encryption) for IP proxying and QUIC streams for direct L4 proxying, the new architecture avoids the problematic L4 to L3 conversion. HTTP/3 (RFC 9114) with the CONNECT method is now used to encapsulate browser requests directly into QUIC streams, keeping traffic at Layer 4.
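As a rough illustration of the HTTP/3 CONNECT step, the sketch below builds the header fields for a CONNECT request to a given destination. Per RFC 9114, section 4.4, such a request carries `:method` set to `CONNECT` and an `:authority` holding the host and port, with `:scheme` and `:path` omitted. The function name `connect_request_headers` is hypothetical, and any additional headers the real client sends are not described in the source.

```python
def connect_request_headers(host: str, port: int) -> list[tuple[bytes, bytes]]:
    """Build header fields for an HTTP/3 CONNECT request (RFC 9114, section 4.4).

    Hypothetical helper for illustration; not Cloudflare's implementation.
    """
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # IPv6 literals must be bracketed inside an authority component.
    authority = f"[{host}]:{port}" if ":" in host else f"{host}:{port}"
    return [
        (b":method", b"CONNECT"),
        (b":authority", authority.encode("ascii")),
        # No :scheme or :path: a classic CONNECT request omits both.
    ]
```

Once the proxy answers with a 2xx status, the QUIC stream that carried the request becomes a bidirectional byte pipe to the destination, so the application's TCP payload never has to be repacketized into IP packets on the client.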
Architectural Benefits of QUIC for Proxying
The shift to QUIC provides three main advantages: the `smoltcp` layer is eliminated, traffic benefits directly from QUIC's modern congestion and flow control, and QUIC parameters can be tuned for better performance. In internal testing, this doubled download and upload speeds and significantly reduced latency.
This architectural change is most useful in scenarios that require coexistence with third-party VPNs, in high-bandwidth application partitioning, and for developers using SOCKS5 with CLI tools. The improved proxy mode ensures that adding zero-trust security does not come at the cost of user experience, so high-definition streaming and large data transfers keep their performance.