Datadog Blog·June 23, 2026

Migrating a Live Routing System with AI-Assisted Refactoring

This article details Datadog's migration of a live routing system from a legacy model to a relational one using AI-assisted refactoring. It highlights the challenges of safely changing critical infrastructure and how a shadow-testing approach, comparing AI-generated and legacy system outputs, ensured reliability during the transition.


Migrating core infrastructure, especially a live routing system, presents significant challenges. Datadog's article describes how the team refactored a critical routing component. The goal was to move from an imperative, ad-hoc data model to a more structured, relational representation that is easier to maintain and scale, without disrupting live traffic.

The Challenge: Safe Migration of a Critical Live System

The existing routing system, referred to as the "routing brain," was implemented in Scala and operated on an in-memory graph. While functional, its imperative nature and lack of a formal schema made it difficult to reason about and extend. The migration aimed to represent routing logic using a more robust relational model.

System Design Implication: Data Model Evolution

This case highlights a common system design challenge: evolving fundamental data models in production systems. Moving from an in-memory, graph-like structure to a relational model often implies better query capabilities, schema enforcement, and easier integration with standard tooling, but requires careful transition strategies.
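As a rough illustration of what such a data model change can look like, the sketch below copies a small in-memory routing graph into a relational table and queries it. Everything in it is an assumption made for illustration: the `legacy_graph` contents, the `route_edge` table, and the `load_relational` and `next_hop` helpers are hypothetical and do not reflect Datadog's actual schema or code. SQLite stands in for whatever relational store is used.

```python
import sqlite3

# Hypothetical in-memory routing graph: node -> list of (match_tag, next_node).
legacy_graph = {
    "ingress": [("env:prod", "prod-queue"), ("env:staging", "staging-queue")],
    "prod-queue": [("team:payments", "payments-sink")],
}


def load_relational(graph):
    """Copy the graph's edges into a table with an enforced schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE route_edge (
            source    TEXT NOT NULL,
            match_tag TEXT NOT NULL,
            target    TEXT NOT NULL,
            PRIMARY KEY (source, match_tag)  -- one target per (source, tag)
        )
        """
    )
    rows = [(src, tag, dst) for src, edges in graph.items() for tag, dst in edges]
    conn.executemany("INSERT INTO route_edge VALUES (?, ?, ?)", rows)
    return conn


def next_hop(conn, source, tag):
    """Return the target for (source, tag), or None if no rule matches."""
    row = conn.execute(
        "SELECT target FROM route_edge WHERE source = ? AND match_tag = ?",
        (source, tag),
    ).fetchone()
    return row[0] if row else None
```

The primary key is one example of schema enforcement: a second rule for the same source and tag is rejected by the database instead of silently overwriting the first.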

AI-Assisted Refactoring and Shadow Testing

To ensure a safe migration, Datadog combined AI-assisted refactoring with shadow testing. An AI model was trained on the existing routing logic to translate requests into the new relational model. The output produced through the new relational model was then shadowed against the live system's output.

Shadow testing involves running the new system (or component) in parallel with the old system, feeding it copies of live production traffic, and comparing its output or behavior without affecting real users. This allows for validation under realistic load and data patterns before a full cutover. The AI's role was to help bridge the semantic gap between the old and new data models during this process.
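The comparison loop described above can be sketched as follows. This is a minimal, single-process illustration under stated assumptions, not Datadog's implementation: `shadow_compare`, `ShadowReport`, and the two routing callables are hypothetical names, and a production setup would typically run the candidate asynchronously on copied traffic.

```python
from dataclasses import dataclass, field


@dataclass
class ShadowReport:
    total: int = 0
    mismatches: list = field(default_factory=list)     # (request, expected, actual)
    shadow_errors: list = field(default_factory=list)  # (request, error text)


def shadow_compare(requests, legacy_route, candidate_route):
    """Serve every request from the legacy router; run the candidate only to compare.

    Both routers are hypothetical callables that map a request to a routing decision.
    """
    report = ShadowReport()
    served = []
    for request in requests:
        expected = legacy_route(request)  # authoritative answer, always served
        served.append(expected)
        report.total += 1
        try:
            actual = candidate_route(request)
        except Exception as exc:
            # A failure in the shadow path is recorded, never propagated to users.
            report.shadow_errors.append((request, repr(exc)))
            continue
        if actual != expected:
            report.mismatches.append((request, expected, actual))
    return served, report
```

Users only ever receive the legacy result. The report accumulates disagreements and candidate failures, and reviewing it is what informs the decision to cut over.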

Tags: migration, refactoring, live system, shadow testing, AI/ML, routing, data model, reliability
