AWS has released the next generation of Amazon OpenSearch Serverless, featuring a redesigned architecture that significantly improves resource provisioning, offers true scale-to-zero capabilities, and reduces costs. This update positions OpenSearch Serverless as a key building block for agentic AI applications, emphasizing its decoupled compute and storage architecture for enhanced scalability and efficiency.
Read original on InfoQ Architecture
The latest iteration of Amazon OpenSearch Serverless, dubbed NextGen, represents a significant architectural overhaul from its predecessor, the Classic architecture. This redesign focuses on addressing critical pain points experienced by users, particularly around resource provisioning times and cost efficiency for varying workloads. The primary goal was to achieve faster scaling, lower idle costs, and better support for emerging AI-driven search patterns.
The core of the NextGen architecture's improvements lies in its decoupled compute and storage layers. Unlike traditional designs where compute and storage are tightly coupled (e.g., local disks on compute instances), NextGen separates these concerns using a shared storage layer. This separation underlies the goals stated above: faster scaling, scaling to zero when idle, and lower idle costs.
The NextGen architecture also introduces new endpoint formats for improved network resource management. While per-collection endpoints (`.aoss..on.aws`) still exist, a new per-account regional endpoint (`.aoss..on.aws`) allows access to all collections within an account via a single hostname. This can simplify client-side connection management by letting a client reuse a single connection pool and TLS session across multiple collections.
Design Consideration: Multi-Tenancy and Cost Optimization
The introduction of Collection Groups in NextGen is a significant feature for multi-tenant architectures. By sharing compute capacity across multiple collections within a group, organizations can achieve greater cost reductions, especially for smaller workloads that might otherwise incur higher costs if each had dedicated compute. This design choice highlights a trade-off between strict isolation and resource efficiency.
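The capacity argument can be made concrete with a little arithmetic. The sketch below compares the total compute floor of several small collections with dedicated capacity against one shared group with a fixed ceiling. Every number in it is a hypothetical value chosen for illustration, not an AWS limit or price.

```shell
# Hypothetical numbers for illustration only; not AWS limits or pricing.
collections=5          # number of small tenant collections
dedicated_floor_ocu=2  # assumed per-collection floor when compute is dedicated
group_max_ocu=4        # assumed ceiling for one shared collection group

# Dedicated compute: the floors add up linearly with the number of collections.
dedicated_total=$((collections * dedicated_floor_ocu))

# Shared compute: the whole group is capped, however many collections it holds.
difference=$((dedicated_total - group_max_ocu))

echo "dedicated: ${dedicated_total} OCU, shared group: at most ${group_max_ocu} OCU, difference: ${difference} OCU"
```

The flip side is the isolation trade-off mentioned above: collections in the same group draw on the same capacity, so a busy tenant can affect its neighbors.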
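# Create a NextGen collection group; its collections share compute capacity.
# Capacity limits are in OCUs, and each minimum must not exceed its maximum.
# minSearchCapacityInOCU=0 is an assumed value (scale to zero); adjust as needed.
aws opensearchserverless create-collection-group \
  --name articles-cg \
  --generation NEXTGEN \
  --standby-replicas ENABLED \
  --capacity-limits "minIndexCapacityInOCU=0,maxIndexCapacityInOCU=4,minSearchCapacityInOCU=0,maxSearchCapacityInOCU=2"

# Create a vector search collection inside that group.
aws opensearchserverless create-collection \
  --name articles-vectors \
  --type VECTORSEARCH \
  --collection-group-name articles-cg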
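Because the capacity limits are passed as a single comma-separated string, it is easy to end up with a minimum larger than its maximum. A minimal sketch of a local pre-flight check follows. `check_capacity_limits` is a hypothetical helper, not part of the AWS CLI, and it assumes whole-number OCU values and the four key names used in the command above.

```shell
# Hypothetical helper (not part of the AWS CLI): sanity-check a
# --capacity-limits string before sending it to the service.
check_capacity_limits() {
  local limits="$1" pair key value
  local min_index="" max_index="" min_search="" max_search=""
  local IFS=','

  # Split "k1=v1,k2=v2,..." on commas and pick out the four known keys.
  for pair in $limits; do
    key="${pair%%=*}"
    value="${pair#*=}"
    case "$key" in
      minIndexCapacityInOCU)  min_index="$value" ;;
      maxIndexCapacityInOCU)  max_index="$value" ;;
      minSearchCapacityInOCU) min_search="$value" ;;
      maxSearchCapacityInOCU) max_search="$value" ;;
    esac
  done

  # Assumption: every limit is present and is a non-negative whole number.
  for value in "$min_index" "$max_index" "$min_search" "$max_search"; do
    case "$value" in
      ''|*[!0-9]*) echo "invalid: missing or non-integer capacity value"; return 1 ;;
    esac
  done

  # Each minimum must not exceed its maximum.
  if [ "$min_index" -gt "$max_index" ]; then
    echo "invalid: minIndexCapacityInOCU ($min_index) > maxIndexCapacityInOCU ($max_index)"
    return 1
  fi
  if [ "$min_search" -gt "$max_search" ]; then
    echo "invalid: minSearchCapacityInOCU ($min_search) > maxSearchCapacityInOCU ($max_search)"
    return 1
  fi
  echo "ok"
}

# Example: a limits string whose search minimum exceeds its search maximum.
check_capacity_limits "minIndexCapacityInOCU=0,maxIndexCapacityInOCU=4,minSearchCapacityInOCU=4,maxSearchCapacityInOCU=2" \
  || echo "fix the limits before calling the AWS CLI"
```

The check runs entirely locally, so it can sit in a deployment script ahead of the `create-collection-group` call. The service may enforce additional constraints that this sketch does not know about.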