Datadog Blog·May 22, 2026

Securing AI Agents: Guardrail Placement in Self-Orchestrated vs. Managed Solutions

This article explores the architectural decision of where to place security guardrails in AI agent systems, comparing Amazon Bedrock Agents (managed) with self-orchestrated agents using custom solutions like Datadog AI Guard. It highlights the trade-offs between tightly coupled, vendor-managed guardrails and flexible, custom-implemented ones, particularly in mitigating indirect prompt injection.

AI & ML Infrastructure Security Distributed Systems

Read original on Datadog Blog

Architectural Choices for AI Agent Guardrails

Securing AI agents is a critical system design challenge. A core decision involves determining where in the agent's workflow to implement guardrails. This article contrasts two primary architectural patterns: using a fully managed AI agent service (like Amazon Bedrock Agents) where guardrails are often integrated by the vendor, versus building a self-orchestrated agent where guardrails are custom-implemented at various stages.

Managed vs. Self-Orchestrated Guardrails

Managed AI Agent Services (e.g., Amazon Bedrock Agents) often provide built-in guardrails. This simplifies development and deployment but can limit customization and observability. The guardrails are typically applied pre-processing (on user input) and post-processing (on LLM output). While convenient, these may not cover all intermediate steps or allow for highly specific, domain-aware logic. The vendor controls the guardrail's implementation and placement.

Self-Orchestrated Agents offer greater flexibility. With custom orchestration frameworks, developers can strategically place guardrails at multiple points: before sending user input to the LLM, after retrieving information from a RAG (Retrieval Augmented Generation) system, before tool execution, and after LLM output. This granular control is crucial for complex scenarios and ensures that every interaction step is validated.

💡

System Design Implication

The choice between managed and self-orchestrated guardrails is a trade-off between time-to-market/operational overhead and customization/security posture. For highly sensitive applications or those requiring unique validation logic, a self-orchestrated approach with custom guardrail placement often provides a stronger security stance.

Preventing Indirect Prompt Injection

The article demonstrates the importance of guardrail placement using an indirect prompt injection scenario. In such attacks, malicious instructions are embedded not in the user's direct prompt, but within external data sources (e.g., a document retrieved by a RAG system). Effective mitigation requires guardrails to inspect not only the direct user prompt but also all contextual information fed to the LLM. For a self-orchestrated agent, this means implementing guardrails *before* information retrieval, *on* the retrieved information, and *before* sending the augmented prompt to the LLM.

AI agentsLLMGuardrailsPrompt InjectionSystem SecurityCloud ArchitectureAmazon BedrockDatadog

Comments

Loading comments...

Architecture Design

Design this yourself

Design an AI agent system that incorporates robust, multi-stage security guardrails to mitigate indirect prompt injection and other LLM-specific vulnerabilities. Focus on the placement and implementation strategies for these guardrails within a self-orchestrated agent framework, including validation of user input, retrieved context, and LLM output before tool execution.

Practice Interview

Focus: security guardrails for AI agents

Other design angles

· Design a security proxy layer specifically for LLM interactions that can apply guardrails to both direct prompts and augmented context from external sources.· Design a data pipeline for a RAG system that includes a dedicated guardrail service for sanitizing and validating retrieved documents before they are used to augment LLM prompts.· Design a multi-tenant AI agent platform where guardrail policies can be defined and enforced at both the platform level and per-tenant level.