ByteByteGo·May 11, 2026

Pinterest's Production Ecosystem for AI Agents with Model Context Protocol (MCP)

This article details how Pinterest built a robust production ecosystem around the Model Context Protocol (MCP) to enable AI agents to interact with internal tools like Presto, Spark, and Airflow. It focuses on the architectural decisions and engineering challenges beyond the protocol itself, including a unified deployment pipeline, a central registry, and a two-layer authorization model, which are crucial for scaling AI agent capabilities securely within an enterprise environment.


Pinterest faced the challenge of connecting numerous AI-powered surfaces (chat apps, IDE plugins, chatbots) with a diverse set of internal engineering tools (Presto, Spark, Airflow, ticketing platforms). Without a shared protocol, each surface-tool pairing would need a bespoke integration: an N × M problem, where five AI surfaces and ten tools would require fifty integrations. Adopting the Model Context Protocol (MCP) turned this into an N + M problem: each surface implements one MCP client and each tool exposes one MCP server, and any client can communicate with any server through a standardized interface.
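The arithmetic behind the N × M claim can be sketched directly (the function and its name are illustrative, not from the article):

```python
def integration_count(n_surfaces: int, n_tools: int, shared_protocol: bool) -> int:
    """Bespoke adapters grow multiplicatively; with a shared protocol you
    need only one client per surface plus one server per tool."""
    if shared_protocol:
        return n_surfaces + n_tools  # N + M
    return n_surfaces * n_tools      # N x M

print(integration_count(5, 10, shared_protocol=False))  # 50 bespoke integrations
print(integration_count(5, 10, shared_protocol=True))   # 15 with MCP
```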

Key Architectural Decisions for the MCP Ecosystem

Pinterest made three critical architectural bets that shaped their MCP ecosystem, each with clear trade-offs:

  1. Cloud-hosted servers, not local ones: While MCP supports local servers, Pinterest prioritized cloud-hosted servers to leverage existing routing and security infrastructure. This introduced latency due to network requests but allowed for consistent authentication, authorization, logging, and monitoring across all servers, rather than relying on individual developer configurations.
  2. Many small servers, not one giant one: Pinterest opted for domain-specific MCP servers (e.g., Presto MCP server for data queries, Spark MCP server for job debugging) instead of a monolithic server. This decision was driven by differing access control requirements and the need to manage the AI model's context window effectively by providing only relevant tool descriptions. The trade-off was increased operational overhead per server.
  3. A unified deployment pipeline: To mitigate the operational burden of many small servers, Pinterest invested in a unified deployment pipeline. This platform abstracts away boilerplate for deployment, scaling, and infrastructure, allowing domain experts to focus on business logic and accelerating the rollout of new MCP servers. Without this, the 'many small servers' bet would have been unsustainable.

The MCP Registry and Two-Layer Authorization Model

At the core of the ecosystem is the MCP registry, acting as a central catalog and governance backbone. It provides both a web UI for human discovery and an API for AI clients to programmatically discover, validate, and check authorization for servers. This registry ensures that only approved production servers are used.
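Programmatic discovery against such a registry might look like the following sketch; the registry contents, statuses, and URL scheme are invented for illustration:

```python
# Hypothetical registry data: in practice this would live behind the
# registry API, not in a client-side dict.
APPROVED_SERVERS = {
    "presto-mcp": {"url": "https://mcp.internal/presto", "status": "production"},
    "spark-mcp":  {"url": "https://mcp.internal/spark",  "status": "staging"},
}

def resolve_server(name: str) -> str:
    """Return the endpoint for an approved production server, or fail.

    This mirrors the registry's validation role: clients only ever talk
    to servers the registry marks as approved and in production.
    """
    entry = APPROVED_SERVERS.get(name)
    if entry is None or entry["status"] != "production":
        raise LookupError(f"{name} is not an approved production MCP server")
    return entry["url"]
```

A client would call `resolve_server("presto-mcp")` before opening a connection, so unapproved or staging servers are rejected centrally rather than by each client's own allowlist.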


Two-Layer Authorization for Defense in Depth

Given the sensitivity of internal tools, Pinterest implemented a two-layer authorization model:

  1. Layer 1 (coarse-grained, at the network edge with Envoy): An OAuth flow generates a JWT for the user. Envoy, a network proxy, validates this JWT and enforces broad access policies (e.g., which AI applications may talk to which MCP servers). This acts as a fast, network-level check.
  2. Layer 2 (fine-grained, inside each server): Within each MCP server, a decorator pattern enforces tool-level permissions, checking whether a specific user is authorized to invoke a particular tool (e.g., only the Ads engineering group can call `get_revenue_metrics`). For highly sensitive data, business-group gating adds an extra layer of control, providing nuanced, business-logic-specific permissions.
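The layer-2 decorator pattern can be sketched as follows. The group store, group names, and the `require_group` helper are illustrative assumptions; layer 1 (Envoy + JWT) is assumed to have already authenticated the caller before the request reaches the server.

```python
import functools

# Hypothetical group membership store; in production this would be a
# directory or IAM lookup, not an in-memory dict.
GROUP_MEMBERS = {"ads-eng": {"alice"}}

def require_group(group: str):
    """Fine-grained (layer-2) check: only members of the named business
    group may invoke the wrapped tool. Assumes the user identity was
    already authenticated at the network edge (layer 1)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(user: str, *args, **kwargs):
            if user not in GROUP_MEMBERS.get(group, set()):
                raise PermissionError(f"{user} is not authorized: requires group {group}")
            return fn(user, *args, **kwargs)
        return wrapper
    return decorator

@require_group("ads-eng")
def get_revenue_metrics(user: str, campaign_id: str):
    # Placeholder body: would query the real metrics backend.
    return {"campaign": campaign_id, "revenue_usd": 0.0}
```

Keeping this check inside the server (rather than only at the proxy) means the permission logic can reference business concepts, such as which engineering group owns a tool, that the network layer cannot see.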

Tags: AI agents, LLM integration, Model Context Protocol, Pinterest, internal tools, authorization, microservices, API gateway
