Meta developed a unified AI agent platform to automate finding and fixing performance issues across its vast infrastructure, enabling significant power savings and freeing up engineering time. This platform uses a two-layered architecture of standardized tools and encoded domain expertise (skills) to tackle both proactive optimization (offense) and reactive regression mitigation (defense). By centralizing these capabilities, Meta has built a self-sustaining efficiency engine that scales without proportionally increasing headcount, recovering hundreds of megawatts of power.
Read original on Meta Engineering
Operating at Meta's scale, where code serves over 3 billion people, means even a tiny performance regression (e.g., 0.1%) translates into substantial power consumption and increased infrastructure costs. Traditional manual methods for identifying, root-causing, and resolving these issues become a significant bottleneck, consuming valuable engineering time that could otherwise be spent on innovation. This problem manifests in two primary areas: proactively finding optimization opportunities (offense) and reactively mitigating performance regressions (defense).
Meta's solution is a unified AI agent platform designed to automate both offensive and defensive efficiency tasks. The key architectural insight was recognizing that both problems share a similar structure: gather context, apply domain expertise, and create a resolution. This allowed for a single platform built on two distinct layers: standardized tools that handle data access and action execution, and skills that encode domain expertise.
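The shared structure described above (gather context, apply domain expertise, create a resolution) can be sketched as a small pipeline. This is only an illustration under stated assumptions, not Meta's implementation: the names `Task`, `Resolution`, and `run_agent` are hypothetical, and the three stage functions are stubs standing in for real data access and model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Context = Dict[str, str]


@dataclass
class Task:
    kind: str    # "offense" (find an optimization) or "defense" (handle a regression)
    target: str  # hypothetical identifier, e.g. a service name


@dataclass
class Resolution:
    task: Task
    summary: str


def run_agent(
    task: Task,
    gather_context: Callable[[Task], Context],
    apply_expertise: Callable[[Task, Context], str],
    create_resolution: Callable[[Task, str], Resolution],
) -> Resolution:
    """Run the three stages in order; the same flow serves both task kinds."""
    context = gather_context(task)
    finding = apply_expertise(task, context)
    return create_resolution(task, finding)


# Stub stages (assumptions): a real system would query data sources and a model here.
def gather(task: Task) -> Context:
    return {"profile": f"profile data for {task.target}"}


def reason(task: Task, context: Context) -> str:
    verb = "optimize" if task.kind == "offense" else "mitigate regression in"
    return f"{verb} {task.target} using {context['profile']}"


def resolve(task: Task, finding: str) -> Resolution:
    return Resolution(task=task, summary=finding)


offense = run_agent(Task("offense", "service_a"), gather, reason, resolve)
defense = run_agent(Task("defense", "service_b"), gather, reason, resolve)
print(offense.summary)
print(defense.summary)
```

Both calls go through the same `run_agent` function; only the task kind differs, which is the point of treating offense and defense as one problem shape.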
Architectural Insight: Separation of Concerns
The separation of generic 'Tools' (data access/action execution) from specialized 'Skills' (domain-specific logic/reasoning) is a crucial architectural decision. This modularity allows for the reuse of tools across different efficiency problems and simplifies the process of adding new skills as domain expertise evolves. It transforms a generalized LLM into a domain-expert agent.
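One way to picture the Tools/Skills split is a registry of generic tool functions plus skill records that name the tools they need and carry domain guidance as text. The sketch below is an assumption-laden illustration, not the platform's actual API: every tool name, skill name, and the `build_prompt` helper are hypothetical, and the tools return placeholder strings instead of real data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Tool = Callable[[str], str]


# Tool layer (hypothetical stubs): generic data access, no domain logic.
def query_profiling_data(target: str) -> str:
    return f"profile({target})"


def search_code(target: str) -> str:
    return f"code({target})"


def list_recent_changes(target: str) -> str:
    return f"changes({target})"


TOOLS: Dict[str, Tool] = {
    "query_profiling_data": query_profiling_data,
    "search_code": search_code,
    "list_recent_changes": list_recent_changes,
}


# Skill layer: domain expertise expressed as guidance plus the tools it relies on.
@dataclass(frozen=True)
class Skill:
    name: str
    tool_names: Tuple[str, ...]
    guidance: str


FIND_OPTIMIZATION = Skill(
    name="find_optimization",
    tool_names=("query_profiling_data", "search_code"),
    guidance="Look for hot code paths that can be made cheaper.",
)

TRIAGE_REGRESSION = Skill(
    name="triage_regression",
    tool_names=("query_profiling_data", "list_recent_changes"),
    guidance="Identify which recent change most likely caused the regression.",
)


def build_prompt(skill: Skill, target: str, tools: Dict[str, Tool]) -> str:
    """Collect evidence with the skill's tools and pair it with its guidance."""
    evidence = [tools[name](target) for name in skill.tool_names]
    return f"{skill.guidance}\nEvidence: " + "; ".join(evidence)


print(build_prompt(FIND_OPTIMIZATION, "service_a", TOOLS))
print(build_prompt(TRIAGE_REGRESSION, "service_a", TOOLS))
```

In this sketch both skills reuse `query_profiling_data`, and adding a third skill would mean adding one more `Skill` record without touching the tool layer.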
The same underlying platform powers both proactive optimization (offense) and reactive regression handling (defense).
The Capacity Efficiency Program has recovered hundreds of megawatts of power. Beyond direct savings, it fundamentally shifts how engineers approach efficiency. Instead of conducting time-consuming manual investigations, engineers review AI-generated analyses and code, which lets them deploy high-impact fixes faster. The unified architecture also yields compounding returns: new capabilities such as conversational assistants, capacity planning agents, and personalized recommendations can be built by composing existing tools with new skills, minimizing data integration overhead and accelerating innovation.