Slack Engineering · December 1, 2025

Architecting AI Agents for Security Investigations at Slack

This article details Slack's architectural approach to building an AI-powered agent system for streamlining security investigations. It focuses on breaking down complex tasks into chained, single-purpose model invocations with structured outputs, and orchestrating multiple AI personas (Director, Expert, Critic) to enhance control, reduce hallucinations, and improve the consistency of investigation results. The design leverages a 'knowledge pyramid' to optimize model cost and an event-driven service architecture for real-time observation and integration.


The Challenge of AI Agent Consistency in Security

Initial prototypes of AI agents for security investigations, while showing promise, suffered from inconsistent performance and a tendency to jump to conclusions. This variability stemmed from relying heavily on a single, lengthy prompt that acted as a guideline rather than a strict control mechanism. The core system design challenge was to move from a 'prompt engineering' approach to a more robust, controlled, and predictable execution flow for AI-driven security analysis.

Multi-Agent Orchestration with Structured Outputs

Slack's solution involves decomposing the complex investigation into a sequence of smaller, well-defined tasks, each executed by a specific AI agent persona. Crucially, each agent's output is enforced with a JSON schema, ensuring structured and predictable results. This fine-grained control over individual steps significantly improves consistency and reliability compared to monolithic prompts. The application then orchestrates these model invocations, passing relevant context between stages.

💡

Structured Output for LLMs

Leveraging structured output (e.g., a JSON schema) for Large Language Models (LLMs) is a powerful pattern for building reliable AI-powered systems. It transforms often-unpredictable free-form text into a parseable data structure, enabling downstream automation and validation. While beneficial, designers must keep schemas simple: overly intricate structures make models more likely to fail to adhere to the format or to hallucinate fields.
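The pattern can be sketched as follows. This is a minimal, hypothetical illustration: `call_model` stands in for any LLM API that supports JSON output, and the finding fields are invented for the example, not taken from Slack's actual schemas.

```python
import json

# Expected shape of one expert finding (illustrative fields only).
FINDING_SCHEMA = {"summary": str, "severity": str, "evidence": list}

def call_model(prompt: str) -> str:
    # Stubbed response; a real system would invoke an LLM here with a
    # JSON-schema-constrained output mode.
    return json.dumps({
        "summary": "Unusual login from new ASN",
        "severity": "medium",
        "evidence": ["auth log line 4182"],
    })

def parse_finding(raw: str) -> dict:
    """Parse model output and validate it against the expected schema."""
    data = json.loads(raw)  # raises ValueError if the model emitted non-JSON
    for field, ftype in FINDING_SCHEMA.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"field {field!r} missing or wrong type")
    return data

finding = parse_finding(call_model("Summarize the alert"))
```

Because every step validates before the next step runs, a malformed response fails loudly at the boundary instead of silently corrupting the rest of the investigation.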

  • **Director Agent:** Guides the investigation, forms questions for experts, and uses a journaling tool for planning.
  • **Expert Agents (Access, Cloud, Code, Threat):** Domain-specific agents that generate findings from their respective data sources in response to the Director's questions.
  • **Critic Agent:** A 'meta-expert' that assesses the quality and credibility of expert findings using a defined rubric, providing a weakly adversarial relationship to mitigate hallucinations.
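The three personas compose into a simple loop: the Director poses a question, each expert answers, and the Critic scores every finding before it enters the journal. The sketch below is hypothetical; class names, the single `AccessExpert`, and the toy rubric are illustrative stand-ins, not Slack's implementation.

```python
from dataclasses import dataclass

@dataclass
class Finding:
    source: str
    text: str
    credibility: float = 0.0

class AccessExpert:
    """Domain expert: answers the Director's question from access-log data."""
    source = "access"

    def answer(self, question: str) -> Finding:
        # A real expert would query its data source and call a model here.
        return Finding(self.source, f"login anomaly relevant to: {question}")

class Critic:
    """Meta-expert: scores each finding against a quality rubric."""

    def review(self, finding: Finding) -> Finding:
        # Stub rubric: reward findings that cite concrete activity.
        finding.credibility = 0.8 if "login" in finding.text else 0.3
        return finding

class Director:
    """Guides the investigation and journals only reviewed findings."""

    def __init__(self, experts: list, critic: Critic) -> None:
        self.experts, self.critic, self.journal = experts, critic, []

    def investigate(self, alert: str) -> list:
        question = f"What activity relates to {alert}?"
        for expert in self.experts:
            self.journal.append(self.critic.review(expert.answer(question)))
        return self.journal

journal = Director([AccessExpert()], Critic()).investigate("alert-17")
```

The weakly adversarial relationship shows up in the data flow: no expert finding reaches the Director's journal without first passing through the Critic's scoring.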

The Knowledge Pyramid for Cost Optimization

To manage the token cost associated with LLM interactions, Slack implemented a 'knowledge pyramid'. Expert agents, at the base, deal with complex, token-intensive data sources. The Critic then condenses these findings, identifying the most interesting and credible. Finally, the Director receives a highly condensed timeline. This allows for strategic use of models: potentially lower-cost models for initial, detailed expert analysis, and higher-cost models for higher-level decision-making and synthesis, optimizing overall operational expenditure.
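A back-of-the-envelope cost model makes the pyramid's payoff concrete. The tier names, token counts, and per-1k-token prices below are invented for illustration; only the shape (cheap models at the token-heavy base, expensive models at the condensed top) reflects the article's design.

```python
# Hypothetical model tiers: (model name, price per 1k tokens in dollars).
MODEL_TIERS = {
    "expert":   ("small-model", 0.25),  # base: raw, token-heavy data sources
    "critic":   ("mid-model",   1.00),  # middle: condensed expert findings
    "director": ("large-model", 5.00),  # top: short timeline, key decisions
}

def stage_cost(stage: str, tokens: int) -> float:
    """Cost of running one pyramid stage at its assigned model tier."""
    _, price_per_1k = MODEL_TIERS[stage]
    return tokens / 1000 * price_per_1k

# Each layer condenses the one below, so expensive models see few tokens.
pyramid = (stage_cost("expert", 120_000)
           + stage_cost("critic", 8_000)
           + stage_cost("director", 2_000))

# Counterfactual: pushing every token through the large model.
flat = 130_000 / 1000 * 5.00

print(f"pyramid: ${pyramid:.2f} vs flat: ${flat:.2f}")
```

With these illustrative numbers the pyramid costs $48 against $650 for the flat design: the savings come from reserving the expensive model for the small, condensed timeline at the top.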

Service Architecture and Real-time Observability

The production system comprises a simple, scalable architecture: a Hub for API and persistent storage, Workers that process queued investigation tasks and stream events, and a Dashboard for real-time observation and interaction. This event-driven design allows for monitoring investigations as they happen, debugging model invocations, and integrating with existing security detection tools, providing critical operational visibility and control.
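The Worker-and-event-stream shape can be sketched with in-process queues. This is a minimal, single-process stand-in: in it, one queue plays the Hub's task queue and another plays the event stream the Dashboard would consume; queue names and event fields are assumptions, not Slack's wire format.

```python
import queue
import threading

tasks: queue.Queue = queue.Queue()   # investigations enqueued by the Hub
events: queue.Queue = queue.Queue()  # streamed to the Dashboard in real time

def worker() -> None:
    """Drain queued investigation tasks, emitting events as work progresses."""
    while True:
        task = tasks.get()
        if task is None:  # shutdown sentinel
            tasks.task_done()
            break
        events.put({"type": "started", "task": task})
        # ... run the agent pipeline here, emitting one event per model
        # invocation so the Dashboard can observe and debug each step ...
        events.put({"type": "finished", "task": task})
        tasks.task_done()

t = threading.Thread(target=worker)
t.start()
tasks.put("investigation-42")
tasks.put(None)
t.join()

log = [events.get() for _ in range(events.qsize())]
```

Because every state change is an event rather than a log line, the same stream can drive the live Dashboard, post-hoc debugging of model invocations, and hand-offs to existing detection tools.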

AI Agents · LLM Architecture · Security Engineering · System Design · Orchestration · Microservices · Structured Output · Observability
