InfoQ Architecture · June 30, 2026

AWS Lambda MicroVMs for Stateful, Isolated, Multi-Tenant Workloads

AWS Lambda MicroVMs introduce a new serverless compute primitive designed for long-running, stateful, and multi-tenant applications that execute untrusted user or AI agent code. This offering addresses the limitations of traditional VMs, containers, and Lambda Functions by providing VM-level isolation, near-instant snapshot-based launch, and state preservation, eliminating critical trade-offs for secure and efficient multi-tenant environments. It enables architects to build highly isolated serverless applications at scale, particularly for AI agent code execution and secure SaaS offerings.

Addressing the Multi-Tenant Isolation Challenge

AWS Lambda MicroVMs emerged to fill a critical gap in serverless computing: the need for isolated, stateful, and long-running execution environments for multi-tenant applications, especially those running untrusted code. Traditional solutions presented a three-way trade-off:

Virtual Machines (VMs): Offer strong isolation but suffer from slow startup times (minutes).
Containers: Launch quickly but share a kernel, requiring significant hardening for untrusted code and posing security risks.
Lambda Functions: Optimized for event-driven, stateless, short-lived request-response patterns, not long-running interactive sessions that need to retain state.

Key Innovation

Lambda MicroVMs resolve this dilemma by combining VM-level isolation (via Firecracker), near-instant launch (from pre-initialized snapshots), and stateful execution with suspend/resume capabilities. This enables secure, efficient, and interactive experiences for workloads like AI agent execution and user-provided code in SaaS platforms.

Architectural Design and Execution Model

The execution model for MicroVMs differs significantly from that of standard Lambda Functions. Developers create a MicroVM Image by uploading a Dockerfile and code to S3. AWS then runs the Dockerfile, initializes the application, and uses Firecracker to capture a snapshot of the running memory and disk state. MicroVMs subsequently launched from this image resume from the pre-initialized snapshot, skipping the cold boot.
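The "initialize once, restore many times" idea can be illustrated with a small in-memory model. This is only a sketch: `Snapshot`, `MicroVM`, `build_image`, `launch`, and `slow_init` are hypothetical names invented for illustration, and none of them correspond to an AWS or Firecracker API. A deep copy stands in for restoring a snapshot.

```python
import copy
from dataclasses import dataclass

init_calls = 0  # counts how often the expensive initialization runs


@dataclass
class Snapshot:
    """Hypothetical stand-in for a captured memory and disk state."""
    memory: dict
    disk: dict


@dataclass
class MicroVM:
    """Hypothetical running instance restored from a snapshot."""
    memory: dict
    disk: dict


def slow_init():
    """Stands in for running the Dockerfile and warming up the application."""
    global init_calls
    init_calls += 1
    return {"model_loaded": True}, {"/app/packages": ["numpy"]}


def build_image(init_app) -> Snapshot:
    """Run the initialization once, then capture the resulting state."""
    memory, disk = init_app()
    return Snapshot(memory=memory, disk=disk)


def launch(snapshot: Snapshot) -> MicroVM:
    """Restore a private copy of the snapshot instead of re-initializing."""
    return MicroVM(memory=copy.deepcopy(snapshot.memory),
                   disk=copy.deepcopy(snapshot.disk))


image = build_image(slow_init)               # initialization cost paid once
vm_a = launch(image)
vm_b = launch(image)
vm_a.disk["/app/packages"].append("pandas")  # change stays private to vm_a
```

In this model both instances start with the application already initialized, and a change made inside one instance is invisible to the other and to the image itself.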

Snapshot-based Launch: Every MicroVM starts from a pre-initialized snapshot, which keeps launch times short and environments consistent.
Suspend/Resume Lifecycle: A MicroVM can suspend during idle periods by snapshotting its memory and disk state, then resume with all state intact (installed packages, loaded models, working files) when activity returns. This matters most for interactive use cases, where it reduces cost during idle time while preserving user context.
Hardware-level Isolation: MicroVMs run on Firecracker, the same virtual machine monitor (VMM) that powers Lambda Functions. Each user session gets a dedicated VM with no shared kernel or resources, which significantly strengthens security for untrusted code execution.
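The suspend/resume lifecycle described above can be sketched as a two-state machine driven by an idle timeout. Everything here is an assumption made for illustration: the `Session` class, the `suspend_idle` helper, and the timeout values are hypothetical, and the source does not say how idleness is detected or which API triggers a suspend.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"


@dataclass
class Session:
    """Hypothetical per-user session; not an AWS API object."""
    user_id: str
    state: State = State.RUNNING
    last_activity: float = 0.0
    context: dict = field(default_factory=dict)  # live state while running
    saved: Optional[dict] = None                 # stand-in for the stored snapshot

    def suspend(self) -> None:
        """Move the live state into the stored snapshot."""
        self.saved = dict(self.context)
        self.context = {}
        self.state = State.SUSPENDED

    def resume(self) -> None:
        """Restore the live state from the stored snapshot."""
        self.context = dict(self.saved or {})
        self.saved = None
        self.state = State.RUNNING

    def touch(self, now: float) -> None:
        """Record activity, resuming first if the session was suspended."""
        if self.state is State.SUSPENDED:
            self.resume()
        self.last_activity = now


def suspend_idle(sessions, now: float, idle_timeout: float) -> list:
    """Suspend every running session idle for at least idle_timeout seconds."""
    suspended = []
    for session in sessions:
        idle_for = now - session.last_activity
        if session.state is State.RUNNING and idle_for >= idle_timeout:
            session.suspend()
            suspended.append(session.user_id)
    return suspended


alice = Session("alice", last_activity=0.0)
alice.context["packages"] = ["requests"]
swept = suspend_idle([alice], now=600.0, idle_timeout=300.0)
state_while_idle = alice.state
alice.touch(now=900.0)  # activity returns, so the session resumes
```

After the final `touch`, the session is running again with the same `context` it had before it was suspended, which is the property the lifecycle is meant to guarantee.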

Trade-offs and Use Cases

MicroVMs come with trade-offs of their own, primarily cost. They are a premium service compared with Fargate Spot pricing, so the ratio of idle to active time needs careful consideration. In return, their security and statefulness unlock several use cases:
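One way to reason about the idle-to-active ratio is a simple break-even calculation. The sketch below assumes a session is billed at one hourly rate while active and a lower rate while suspended, and compares that with a hypothetical always-on alternative billed at a flat rate. The billing model and every rate shown are placeholders chosen for illustration, not published AWS prices.

```python
def session_cost(active_hours, idle_hours, active_rate, suspended_rate):
    """Cost of a session billed at a lower rate while suspended (assumed model)."""
    return active_hours * active_rate + idle_hours * suspended_rate


def always_on_cost(active_hours, idle_hours, hourly_rate):
    """Cost of an alternative that is billed for every hour, idle or not."""
    return (active_hours + idle_hours) * hourly_rate


def break_even_active_fraction(active_rate, suspended_rate, always_on_rate):
    """Active fraction f at which both options cost the same.

    Solves f * active_rate + (1 - f) * suspended_rate = always_on_rate for f.
    """
    if active_rate == suspended_rate:
        raise ValueError("rates must differ for a break-even point to exist")
    return (always_on_rate - suspended_rate) / (active_rate - suspended_rate)


# Placeholder rates in arbitrary currency units per hour (assumptions).
ACTIVE_RATE = 0.10
SUSPENDED_RATE = 0.01
ALWAYS_ON_RATE = 0.04

microvm_day = session_cost(2, 22, ACTIVE_RATE, SUSPENDED_RATE)
baseline_day = always_on_cost(2, 22, ALWAYS_ON_RATE)
threshold = break_even_active_fraction(ACTIVE_RATE, SUSPENDED_RATE, ALWAYS_ON_RATE)
```

With these placeholder rates the suspendable option is cheaper whenever a session is active for less than roughly one third of the time. Real rates would shift that threshold.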

Untrusted Code Execution: Securely run user-provided or AI-generated code in isolation, without the container-escape risk that comes with a shared kernel.
Multi-tenant SaaS Applications: Provide dedicated, isolated, and stateful execution environments for each tenant.
AI Agents: Host long-running, stateful AI agents that require persistent context.
Highly Isolated Serverless Workloads: Ideal when container-level isolation is insufficient and full VMs are too resource-intensive or slow.
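The per-tenant pattern in the use cases above amounts to routing each tenant to its own dedicated environment and never reusing one across tenants. The following is a minimal sketch of that routing rule. `Environment`, `launch_environment`, and `TenantRouter` are hypothetical names, and the launcher is a stand-in for whatever call actually starts an isolated environment.

```python
from dataclasses import dataclass, field
from itertools import count

_env_ids = count(1)  # simple source of unique environment ids


@dataclass
class Environment:
    """Hypothetical handle for one tenant's dedicated environment."""
    env_id: int
    tenant_id: str
    state: dict = field(default_factory=dict)


def launch_environment(tenant_id: str) -> Environment:
    """Stand-in for the call that starts an isolated environment."""
    return Environment(env_id=next(_env_ids), tenant_id=tenant_id)


class TenantRouter:
    """Maps each tenant to exactly one environment of its own."""

    def __init__(self, launcher=launch_environment):
        self._launcher = launcher
        self._envs = {}

    def route(self, tenant_id: str) -> Environment:
        """Return the tenant's environment, launching it on first use."""
        env = self._envs.get(tenant_id)
        if env is None:
            env = self._launcher(tenant_id)
            self._envs[tenant_id] = env
        return env

    def release(self, tenant_id: str):
        """Forget the tenant's environment, returning it if one existed."""
        return self._envs.pop(tenant_id, None)


router = TenantRouter()
first = router.route("tenant-a")
first.state["cwd"] = "/work"
other = router.route("tenant-b")
again = router.route("tenant-a")  # same environment, state preserved
```

Keeping the mapping keyed only by tenant means a request can never land in another tenant's environment, and repeated requests from the same tenant see the state left by earlier ones.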

AWS Lambda · MicroVMs · Firecracker · Serverless · Isolation · Multi-tenancy · Stateful · AI Infrastructure

Architecture Design

Design a multi-tenant SaaS platform that allows users to run untrusted code (e.g., custom scripts, AI agent code) in isolated, stateful environments. Incorporate AWS Lambda MicroVMs to manage these user sessions, focusing on how to handle provisioning, state preservation with suspend/resume, secure networking, and cost optimization based on user activity.

Practice Interview

Focus: serverless micro-VMs for isolated, stateful, multi-tenant code execution

Other design angles

· Design a secure, scalable AI agent platform where each user's AI agent runs in an isolated, stateful environment using MicroVMs. Detail the architecture for agent lifecycle management, prompt handling, and secure data access.
· Design a collaborative online code editor where each user's session requires a dedicated, stateful, and highly isolated execution environment for running and testing code. Leverage MicroVMs for per-session isolation and rapid context switching.
· Design an API platform that offers custom scripting capabilities to its enterprise clients, allowing them to extend functionality with untrusted code. Outline how Lambda MicroVMs would be integrated to provide secure and performant execution of these scripts.