Menu
Datadog Blog·May 22, 2026

Securing AI Agents: Guardrail Placement in Self-Orchestrated vs. Managed Solutions

This article explores the architectural decision of where to place security guardrails in AI agent systems, comparing Amazon Bedrock Agents (managed) with self-orchestrated agents using custom solutions like Datadog AI Guard. It highlights the trade-offs between tightly coupled, vendor-managed guardrails and flexible, custom-implemented ones, particularly in mitigating indirect prompt injection.

Read original on Datadog Blog

Architectural Choices for AI Agent Guardrails

Securing AI agents is a critical system design challenge. A core decision involves determining where in the agent's workflow to implement guardrails. This article contrasts two primary architectural patterns: using a fully managed AI agent service (like Amazon Bedrock Agents) where guardrails are often integrated by the vendor, versus building a self-orchestrated agent where guardrails are custom-implemented at various stages.

Managed vs. Self-Orchestrated Guardrails

Managed AI Agent Services (e.g., Amazon Bedrock Agents) often provide built-in guardrails. This simplifies development and deployment but can limit customization and observability. The guardrails are typically applied pre-processing (on user input) and post-processing (on LLM output). While convenient, these may not cover all intermediate steps or allow for highly specific, domain-aware logic. The vendor controls the guardrail's implementation and placement.

Self-Orchestrated Agents offer greater flexibility. With custom orchestration frameworks, developers can strategically place guardrails at multiple points: before sending user input to the LLM, after retrieving information from a RAG (Retrieval Augmented Generation) system, before tool execution, and after LLM output. This granular control is crucial for complex scenarios and ensures that every interaction step is validated.

💡

System Design Implication

The choice between managed and self-orchestrated guardrails is a trade-off between time-to-market/operational overhead and customization/security posture. For highly sensitive applications or those requiring unique validation logic, a self-orchestrated approach with custom guardrail placement often provides a stronger security stance.

Preventing Indirect Prompt Injection

The article demonstrates the importance of guardrail placement using an indirect prompt injection scenario. In such attacks, malicious instructions are embedded not in the user's direct prompt, but within external data sources (e.g., a document retrieved by a RAG system). Effective mitigation requires guardrails to inspect not only the direct user prompt but also all contextual information fed to the LLM. For a self-orchestrated agent, this means implementing guardrails *before* information retrieval, *on* the retrieved information, and *before* sending the augmented prompt to the LLM.

AI agentsLLMGuardrailsPrompt InjectionSystem SecurityCloud ArchitectureAmazon BedrockDatadog

Comments

Loading comments...