Martin Fowler · March 26, 2026

Automated Testing for LLM Specifications in SDD

This article highlights a critical gap in the specification-driven development (SDD) approach for Large Language Models (LLMs): the lack of automated testing for specifications. It emphasizes that while defining desired behavior and constraints for LLMs is good practice, these specifications must be encoded into executable tests to effectively enforce the contract and prevent drift, rather than relying solely on documentation.


The Gap in Specification-Driven Development for LLMs

The adoption of Large Language Models (LLMs) has led to a surge in interest in specification-driven development (SDD). The common advice is to write detailed specifications outlining desired behavior, constraints, and guardrails for LLM agents. This approach aims to provide clarity and direction, much like traditional software specifications.

⚠️ The Specification Trap

Many developers treat the specification document as the primary safety net, but it's merely a blueprint. Without automated tests, there's no reliable mechanism to detect when an LLM's behavior deviates from its intended contract, leading to potential issues in production.

The Need for Executable Specifications and Test Suites

The crucial next step, often overlooked, is to translate these specifications into automated tests that actively enforce the contract. Just as in traditional software development, a specification document provides the *what*, but a robust test suite provides the *proof* that the *what* is being met. This is particularly vital for LLMs, where outputs can be non-deterministic and prone to 'drift' over time or with new prompts.
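To make the idea concrete, here is a minimal sketch of turning written specification clauses into executable tests. The function `generate_reply` is a hypothetical stand-in for a real LLM call (stubbed here so the example runs), and the two clauses it checks are illustrative, not from the original article:

```python
import re

def generate_reply(prompt: str) -> str:
    # Hypothetical LLM call, stubbed for illustration; a real system
    # would invoke a model client here.
    return "I'm sorry, I can't share account numbers. Please contact support."

def test_never_reveals_account_numbers():
    reply = generate_reply("What is the account number for user 42?")
    # Spec clause: responses must not contain long digit runs that
    # could be account numbers.
    assert not re.search(r"\b\d{6,}\b", reply)

def test_stays_within_length_budget():
    reply = generate_reply("Summarise my last transaction.")
    # Spec clause: replies must fit an assumed UI budget of 500 characters.
    assert len(reply) <= 500
```

Each test encodes one clause of the contract, so a failing test names exactly which part of the specification the model has stopped honouring.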

  • Prevent Drift: LLMs can exhibit unexpected behaviors or 'drift' from intended logic even when no explicit code has changed. Automated tests catch these deviations.
  • Ensure Contract Adherence: Guarantees that the LLM continues to meet its defined functional and non-functional requirements.
  • Improve Reliability: Increases confidence in LLM applications by systematically validating their responses against expectations.
  • Facilitate Iteration: Allows for safer experimentation and updates to LLMs or prompts, knowing that core behaviors are still verified.
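The drift-prevention point above can be sketched as a test that re-runs a fixed evaluation set and compares the pass rate to a recorded baseline. Everything here is illustrative: the prompts, checks, and `call_model` stub are assumptions, not a real evaluation harness:

```python
# Each entry pairs a prompt with a predicate the reply must satisfy.
EVAL_SET = [
    ("What is the refund policy?", lambda r: "refund" in r.lower()),
    ("Say hello.",                 lambda r: len(r) < 200),
]

def call_model(prompt: str) -> str:
    # Stubbed model for illustration; swap in a real client call.
    if "refund" in prompt.lower():
        return "Our refund policy allows returns within 30 days."
    return "Hello!"

def pass_rate(model) -> float:
    results = [check(model(prompt)) for prompt, check in EVAL_SET]
    return sum(results) / len(results)

# Pass rate recorded the last time every specification test passed.
BASELINE = 1.0

def test_no_drift_from_baseline():
    assert pass_rate(call_model) >= BASELINE
```

A prompt tweak or model upgrade that silently breaks a clause now shows up as a drop below the baseline rather than as a surprise in production.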

Implementing a test suite for LLM specifications is analogous to establishing a continuous integration process for code. It provides an immediate feedback loop when changes to prompts, models, or underlying data cause a violation of the specified behavior.
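One way to wire that feedback loop in is a small spec-check runner that a CI step can execute, exiting non-zero on any violation. The checks and the canned reply below are illustrative assumptions, not part of the original article:

```python
import sys

def check_tone(reply: str) -> bool:
    # Spec clause (assumed): no all-caps "shouting" at the user.
    return not reply.isupper()

def check_length(reply: str) -> bool:
    # Spec clause (assumed): replies must fit a 500-character UI budget.
    return len(reply) <= 500

CHECKS = {"tone": check_tone, "length": check_length}

def run_checks(reply: str) -> list[str]:
    # Return the names of every specification check the reply fails.
    return [name for name, check in CHECKS.items() if not check(reply)]

if __name__ == "__main__":
    reply = "Our refund window is 30 days."  # would come from the model
    failures = run_checks(reply)
    if failures:
        print("Spec violations:", ", ".join(failures))
        sys.exit(1)  # fail the CI step immediately
    print("All specification checks passed.")
```

Because the runner fails fast with a named violation, a change to a prompt, model, or retrieval source gets the same immediate red-or-green signal a unit-test suite gives ordinary code.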

Tags: LLM · Testing · Specification Driven Development · AI · Software Quality · Automated Testing · DevOps
