This article details Vercel's architectural choices in building "Hive," an internal platform that reduced build provisioning times from 90 to 5 seconds. It focuses on the challenges of hostile multi-tenancy and how Vercel leveraged Firecracker microVMs for strong isolation while maintaining high performance for ephemeral, customer-submitted build workloads. The core solution involves a layered approach combining microVMs, containerization, and advanced caching strategies to achieve both security and speed.
Read original on ByteByteGo
Vercel's platform executes customer-provided code on shared infrastructure, which creates a significant security challenge: hostile multi-tenancy. Unlike a platform running cooperative workloads, Vercel must assume that any build script could be malicious and might try to escape its sandbox or access other tenants' data. Traditional Docker containers provide packaging and some isolation, but every container on a host shares the same Linux kernel, so a single kernel exploit can expose every tenant on that machine. This trust problem called for a stronger isolation primitive than standard container orchestration tools like Kubernetes could easily provide without significant hardening.
To address the isolation problem, Vercel adopted Firecracker, the open-source microVM monitor built by AWS, as the core of their "Hive" platform. Firecracker provides VM-level isolation: each microVM has its own guest kernel behind a hardware-enforced boundary, yet it boots in milliseconds and uses minimal memory. This combines isolation close to that of a conventional VM with startup cost close to that of a container. Each customer build runs within a dedicated "cell" (a microVM), with a one-to-one relationship between Firecracker processes and cells.
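The one-cell-per-microVM model can be illustrated with a minimal sketch that builds a separate machine configuration for each cell. The top-level keys and field names below follow Firecracker's JSON configuration-file format as commonly documented and should be checked against the Firecracker version in use. The function name, directory layout, and boot arguments are hypothetical; the source does not describe Vercel's actual configuration.

```python
import json


def cell_vm_config(cell_id: str, vcpus: int, mem_mib: int) -> dict:
    """Build a per-cell microVM config as a plain dict.

    Paths are hypothetical: each cell gets its own writable root
    filesystem so that no disk state is shared between tenants.
    """
    return {
        "boot-source": {
            # Assumed location of a shared, read-only guest kernel image.
            "kernel_image_path": "/opt/hive/vmlinux",
            "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
        },
        "drives": [
            {
                "drive_id": "rootfs",
                "path_on_host": f"/var/lib/hive/cells/{cell_id}/rootfs.ext4",
                "is_root_device": True,
                "is_read_only": False,
            }
        ],
        # CPU and memory are fixed per cell when the microVM is created.
        "machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
    }


if __name__ == "__main__":
    # One config document per cell, and therefore one VMM process per cell.
    print(json.dumps(cell_vm_config("cell-a", vcpus=2, mem_mib=4096), indent=2))
```

Writing each dict to its own file and starting one Firecracker process per file would mirror the one-to-one mapping between processes and cells described above.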
Isolation Strategy
Vercel's approach layers isolation: Firecracker microVMs provide strong kernel-level isolation, while traditional containers run inside these microVMs to handle packaging, build tools, and dependencies. This allows each layer to excel at its primary function: microVMs for security, containers for convenience.
The Hive architecture consists of regional clusters, each with multiple Hives for failure isolation. Inside each physical machine (Box) within a Hive, a box daemon manages Firecracker processes and cell lifecycles, while a cell daemon inside each microVM manages the build container. Cells are ephemeral, destroyed after each build to prevent state leakage between tenants, a security decision prioritized over potential performance gains from reuse. CPU and memory are dedicated per cell, while disk and network throughput are rate-limited, balancing isolation with efficient resource utilization.
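The per-box resource model can be sketched as follows. This is an illustrative toy, not Vercel's implementation: the class names, method names, and numbers are all hypothetical. It shows two ideas from the paragraph above: CPU and memory are reserved per cell and released only when the cell is destroyed after its single build, and disk or network throughput is capped with a rate limiter (a token bucket here, which is an assumed choice of algorithm).

```python
import itertools
from dataclasses import dataclass


@dataclass
class Cell:
    cell_id: int
    vcpus: int
    mem_mib: int


class BoxDaemon:
    """Hypothetical per-machine daemon that tracks dedicated CPU and memory."""

    def __init__(self, total_vcpus: int, total_mem_mib: int):
        self.free_vcpus = total_vcpus
        self.free_mem_mib = total_mem_mib
        self.cells = {}
        self._ids = itertools.count(1)  # cell ids are never reused

    def create_cell(self, vcpus: int, mem_mib: int):
        # Dedicated resources: refuse the cell rather than overcommit.
        if vcpus > self.free_vcpus or mem_mib > self.free_mem_mib:
            return None
        cell = Cell(next(self._ids), vcpus, mem_mib)
        self.free_vcpus -= vcpus
        self.free_mem_mib -= mem_mib
        self.cells[cell.cell_id] = cell
        return cell

    def destroy_cell(self, cell_id: int) -> None:
        cell = self.cells.pop(cell_id)
        self.free_vcpus += cell.vcpus
        self.free_mem_mib += cell.mem_mib

    def run_build(self, vcpus: int, mem_mib: int, build):
        """Run exactly one build in a fresh cell, then destroy the cell."""
        cell = self.create_cell(vcpus, mem_mib)
        if cell is None:
            raise RuntimeError("box has no free capacity")
        try:
            return build(cell)
        finally:
            # Ephemeral by design: the cell is torn down even if the build fails.
            self.destroy_cell(cell.cell_id)


class TokenBucket:
    """Rate limiter for shared disk or network throughput (units are arbitrary)."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float = 0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity
        self.last = now

    def try_consume(self, amount: float, now: float) -> bool:
        # Refill for the elapsed time, capped at the bucket capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False
```

In this sketch a box with 4 vCPUs accepts two 2-vCPU cells and rejects a third until one is destroyed, whereas the token bucket lets a cell burst up to its capacity and then throttles it to the refill rate. That difference reflects the split in the text: hard reservation for CPU and memory, soft limits for disk and network.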
Together, the three layers (microVMs, containers, and caching) reduce typical build provisioning from 90 seconds to 5 seconds. The architecture puts security first through strong isolation and ephemeral environments, then adds performance optimizations on top to deliver a fast and secure developer experience on a multi-tenant platform.