This article discusses the evolution from a cellular architecture to a hybrid multi-tenant model for stateful services on AWS, primarily focusing on ad-serving infrastructure. It highlights the challenges of scale, efficiency, onboarding time, and the noisy neighbor problem encountered with strict tenant isolation, then presents a solution that balances isolation with operational improvements. The new architecture leverages a three-level hierarchy (tiers, cells, infra groups) and AWS services like Route 53, ALB, ECS, and PrivateLink to achieve cluster-level isolation within shared accounts, significantly reducing operational overhead and improving scalability.
Read original on AWS Architecture Blog

Ad-serving infrastructure often faces the dilemma of balancing strong tenant isolation with operational efficiency, especially for stateful services. The article presents a real-world case study of moving from a highly isolated cellular architecture, where each tenant received dedicated AWS accounts and infrastructure, to a more efficient hybrid multi-tenant approach.
The initial cellular architecture provided excellent isolation but introduced significant operational overhead and inefficiencies. The key problems were limited scale, inefficiency, long tenant onboarding times, and the noisy neighbor problem.
The new hybrid architecture aims to provide cluster-level isolation within shared accounts, addressing the previous inefficiencies. It introduces a three-level hierarchy of tiers, cells, and infra groups, and uses AWS services such as Route 53, ALB, ECS, and PrivateLink to streamline operations.
Key Design Principle
The primary reason for the 80% reduction in infrastructure setup is the architectural decision to pre-wire downstream service dependencies at tier creation, not at tenant onboarding. Once a tier is provisioned with PrivateLink connections to downstream services, all tenants onboarded to that tier automatically inherit full connectivity.
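The principle can be illustrated with a minimal sketch. It is not the article's implementation: the `Tier` class, the `_create_endpoint` helper, the endpoint ID format, and the service names are all hypothetical, and the PrivateLink provisioning step is replaced by a counter so the example runs without AWS access. It only shows that connectivity cost is paid per tier and per downstream service, not per tenant.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Tier:
    """Hypothetical model of a tier whose downstream wiring happens once, at creation."""

    name: str
    downstream_services: list[str]
    # Stand-in for PrivateLink connections: service name -> made-up endpoint ID.
    endpoints: dict[str, str] = field(default_factory=dict)
    tenants: list[str] = field(default_factory=list)
    setup_calls: int = 0  # counts how often the expensive provisioning step runs

    def __post_init__(self) -> None:
        # Pre-wire every downstream dependency when the tier is provisioned.
        for service in self.downstream_services:
            self.endpoints[service] = self._create_endpoint(service)

    def _create_endpoint(self, service: str) -> str:
        # Placeholder for real provisioning work; here it only increments a counter.
        self.setup_calls += 1
        return f"vpce-{self.name}-{service}"

    def onboard(self, tenant: str) -> dict[str, str]:
        # Onboarding registers the tenant and returns the shared, existing endpoints.
        self.tenants.append(tenant)
        return self.endpoints


tier = Tier("standard", ["bidding", "reporting"])  # assumed service names
tier.onboard("tenant-a")
tier.onboard("tenant-b")
print(tier.setup_calls)  # 2: one per downstream service, regardless of tenant count
```

In this sketch, adding a third or a hundredth tenant leaves `setup_calls` unchanged, which is the property the design principle relies on.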