AWS Architecture Blog·June 22, 2026

Secure Multi-Tenant RAG Architecture with Fine-Grained Authorization in AWS

This article presents an architecture for implementing secure, multi-tenant Retrieval Augmented Generation (RAG) applications within an enterprise, focusing on fine-grained access control. It demonstrates how to use a single Amazon Bedrock Knowledge Base instance to serve multiple departments while isolating document access through metadata filtering and dynamic policy evaluation with Amazon Verified Permissions. The core design combines a two-layer defense-in-depth authorization strategy with externalized policies, so access rules can be updated without redeploying code.

Read original on AWS Architecture Blog

The Challenge of Multi-Tenant RAG Authorization

Large organizations often struggle to provide internal generative AI applications such as RAG while maintaining strict control over which teams and roles can access which documents. Duplicating infrastructure per group is costly and complex. The goal is to let a single RAG application serve multiple departments, ensuring employees only access authorized material, while also accommodating executives who might need cross-departmental access. This architecture focuses on logical isolation between departments inside one organization rather than hard multi-tenant isolation for separate customers.

Core Architectural Pattern: Metadata Filtering with Externalized Policies

The solution leverages a single, shared Amazon Bedrock Knowledge Base. Document isolation is achieved by tagging documents with department-specific metadata during ingestion. Instead of embedding filter selection logic directly in application code, authorization decisions are externalized to Amazon Verified Permissions using Cedar policies. This allows dynamic, runtime-evaluated authorization, enabling access rule changes without code redeployment and providing a detailed audit trail.
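As a rough illustration of the tagging step, the sketch below builds a metadata companion file for a document. It assumes the Bedrock Knowledge Base S3 data source convention of a `<object key>.metadata.json` file containing a `metadataAttributes` object; the attribute name `department` and the helper `metadata_sidecar` are hypothetical, and the original article's pipeline may store metadata differently.

```python
import json


def metadata_sidecar(object_key: str, department: str) -> tuple[str, str]:
    """Return (sidecar_key, sidecar_body) for a document stored at object_key.

    Assumption: the knowledge base reads per-document metadata from a
    companion file named "<object_key>.metadata.json".
    """
    if not department:
        # An untagged document could not be isolated by a metadata filter,
        # so this sketch refuses to produce a sidecar for it.
        raise ValueError("refusing to ingest a document without a department tag")
    sidecar_key = f"{object_key}.metadata.json"
    body = json.dumps({"metadataAttributes": {"department": department}})
    return sidecar_key, body
```

An ingestion function could upload the returned body under the returned key next to the document before the knowledge base sync runs.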

Defense-in-Depth Authorization

The architecture implements a two-layer defense-in-depth authorization strategy:

  1. API Access (Layer 1): An AWS Lambda Authorizer on Amazon API Gateway calls Verified Permissions to decide if a user can invoke the API at all.
  2. Document Access (Layer 2): A middleware Lambda function, orchestrating calls to the Knowledge Base, also queries Verified Permissions to determine which resources (document tags) the user is permitted to query, constructing a metadata filter for the `RetrieveAndGenerate` API.

Each layer operates independently, providing resilience if one layer fails or is bypassed.
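The two layers can be sketched as a pair of small functions around one Verified Permissions call. This is a minimal sketch under stated assumptions, not the article's implementation: the Cedar entity type names (`RagApp::User`, `RagApp::Action`, `RagApp::Department`), the `Query` action, and all helper names are hypothetical, and `avp` stands for a client exposing `is_authorized` with the same keyword arguments as the boto3 `verifiedpermissions` client. Token verification and the passing of user attributes to the policy store are omitted.

```python
from typing import Any, Iterable

# Hypothetical Cedar entity types; a real policy store defines its own schema.
USER_TYPE = "RagApp::User"
ACTION_TYPE = "RagApp::Action"
DEPARTMENT_TYPE = "RagApp::Department"


def is_allowed(avp: Any, policy_store_id: str, user_id: str,
               action_id: str, resource_type: str, resource_id: str) -> bool:
    """Ask Verified Permissions for one decision; anything but ALLOW is a deny."""
    response = avp.is_authorized(
        policyStoreId=policy_store_id,
        principal={"entityType": USER_TYPE, "entityId": user_id},
        action={"actionType": ACTION_TYPE, "actionId": action_id},
        resource={"entityType": resource_type, "entityId": resource_id},
    )
    return response.get("decision") == "ALLOW"


def authorizer_response(user_id: str, allowed: bool, method_arn: str) -> dict:
    """Layer 1: the IAM policy a Lambda authorizer returns to API Gateway."""
    return {
        "principalId": user_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": "Allow" if allowed else "Deny",
                "Resource": method_arn,
            }],
        },
    }


def allowed_departments(avp: Any, policy_store_id: str, user_id: str,
                        candidates: Iterable[str]) -> list[str]:
    """Layer 2: keep only the department tags the policies let this user query."""
    return [
        dept for dept in candidates
        if is_allowed(avp, policy_store_id, user_id, "Query", DEPARTMENT_TYPE, dept)
    ]
```

Because both functions treat any response other than `ALLOW` as a deny, an unexpected or malformed decision fails closed. In a deployed function, `avp` would typically be `boto3.client("verifiedpermissions")`.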

Ingestion and Query Flow

  1. Ingestion Pipeline: Documents uploaded to Amazon S3 trigger an EventBridge event, which is routed via SQS to a Lambda function. This function processes each document, extracts its metadata, and stores that metadata in a sidecar (e.g., DynamoDB) alongside the document in S3, so that every document carries the appropriate department metadata before Bedrock indexes it.
  2. Query Flow: A client sends a request to API Gateway. The Lambda Authorizer (Layer 1) checks basic API access via Verified Permissions. If authorized, the request proceeds to a middleware Lambda (Layer 2). The middleware uses the user's identity (e.g., from a Cognito JWT) to query Verified Permissions, which evaluates Cedar policies to determine the document tags the user is authorized for. The middleware then constructs a metadata filter from these tags and passes it to the Bedrock `RetrieveAndGenerate` API. Only documents matching the filter are used for RAG.
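The filter-construction step of the query flow might look like the following sketch. It assumes documents carry a `department` metadata attribute and that the middleware already holds the list of authorized tags; the function names are hypothetical. The request shape follows the `RetrieveAndGenerate` API's `retrievalConfiguration.vectorSearchConfiguration.filter` field with the `equals` and `in` operators.

```python
from typing import Sequence


def department_filter(departments: Sequence[str], key: str = "department") -> dict:
    """Build a metadata filter that matches only the authorized departments."""
    tags = sorted(set(departments))
    if not tags:
        # Fail closed: an omitted filter would search the whole shared
        # knowledge base, so no authorized tags means no query at all.
        raise PermissionError("no authorized departments; refusing to query")
    if len(tags) == 1:
        return {"equals": {"key": key, "value": tags[0]}}
    return {"in": {"key": key, "value": tags}}


def retrieve_and_generate_request(question: str, departments: Sequence[str],
                                  knowledge_base_id: str, model_arn: str) -> dict:
    """Assemble keyword arguments for a RetrieveAndGenerate call."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
                "retrievalConfiguration": {
                    "vectorSearchConfiguration": {
                        "filter": department_filter(departments),
                    },
                },
            },
        },
    }
```

The resulting dictionary could be passed as `boto3.client("bedrock-agent-runtime").retrieve_and_generate(**request)`. Raising on an empty tag list, rather than sending an unfiltered query, is a design choice of this sketch intended to keep a policy or middleware error from widening access.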

Isolation Model and Residual Risk

This pattern provides filter-level (logical) isolation, not IAM-enforced (infrastructure) isolation. The underlying Knowledge Base is a shared resource. A failure in the middleware logic constructing the filter could expose documents from other groups. For hard cross-tenant isolation where compliance mandates infrastructure separation, a dedicated Knowledge Base per tenant with IAM boundaries is recommended, with this pattern layered on top for finer-grained control within each tenant.

The architecture avoids separate Knowledge Base instances per department, reducing cost and operational overhead. Because access rules live in Amazon Verified Permissions and are evaluated at runtime, policy updates take effect without code changes or deployments.

Tags: RAG, Multi-tenancy, Authorization, AWS Bedrock, AWS Verified Permissions, Metadata Filtering, Serverless, Access Control
