This article introduces Just-in-Time Tests (JiTTests), a novel approach to automated software testing where Large Language Models (LLMs) generate tests on-the-fly for specific code changes. This method aims to address the challenges of traditional testing in the era of rapid agentic development by eliminating manual test authoring, maintenance, and review, thereby accelerating the detection of regressions.
Read the original on Meta Engineering.

The rapid adoption of agentic software development, where AI agents contribute significantly to code generation and delivery, has exposed limitations in traditional software testing paradigms. Manual test creation and maintenance struggle to keep pace with the increased velocity of code changes, leading to inefficiencies, high false positive rates, and significant operational overhead. This article proposes JiTTesting as a solution to this evolving challenge.
JiTTests represent a fundamental departure from static, manually authored test suites. Instead of maintaining a persistent collection of tests, JiTTests are dynamically generated and executed in response to each specific code change (e.g., a pull request). This on-demand generation, powered by LLMs, allows tests to be highly tailored and relevant to the immediate context of the change, significantly reducing the likelihood of false positives and eliminating the burden of test maintenance.
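The generate-and-execute flow described above can be sketched in a few lines. This is a minimal illustration, not the actual Meta implementation: `build_prompt`, `generate_test`, and `run_jit_test` are hypothetical names, and the LLM call is replaced by a stub that returns a canned test for the changed function.

```python
# Sketch of a Just-in-Time test: a (stubbed) LLM turns a code diff into a
# throwaway test, which runs immediately against the changed code and is
# never persisted. All names here are illustrative, not Meta's API.

def build_prompt(diff: str) -> str:
    # Context the LLM would receive: the diff plus an instruction.
    return "Write a test exercising the behavior changed by this diff:\n" + diff

def generate_test(prompt: str) -> str:
    # Stub standing in for an LLM inference call in a real pipeline.
    return (
        "def test_clamp_bounds():\n"
        "    assert clamp(5, 0, 3) == 3\n"
        "    assert clamp(-1, 0, 3) == 0\n"
    )

def run_jit_test(diff: str, changed_symbols: dict) -> bool:
    # Generate a test for this specific change, load it into a scope
    # containing the changed code, and run it; True means the change passed.
    scope = dict(changed_symbols)
    exec(generate_test(build_prompt(diff)), scope)
    try:
        for name, obj in list(scope.items()):
            if name.startswith("test_") and callable(obj):
                obj()
        return True
    except AssertionError:
        return False

# The changed code under review (example):
def clamp(x, lo, hi):
    return max(lo, min(x, hi))

diff = "+def clamp(x, lo, hi):\n+    return max(lo, min(x, hi))"
print(run_jit_test(diff, {"clamp": clamp}))  # prints True
```

Because the test exists only for the lifetime of this one change, there is nothing to maintain or review afterward, which is the core of the JiT idea.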
Key Advantages for System Design
From a system design perspective, JiTTesting introduces a highly automated, adaptive, and scalable testing infrastructure. It shifts the burden of test creation and maintenance from human engineers to AI, enabling faster feedback loops and improving the overall efficiency of the development pipeline, especially in large-scale, continuously evolving systems. This affects CI/CD pipelines, developer experience, and the overall robustness of large codebases.
Implementing a JiTTesting system requires a robust architecture capable of integrating LLMs, code analysis tools, and mutation testing frameworks into the CI/CD pipeline. Key considerations include the performance of LLM inference for test generation, the efficiency of mutant creation and execution, and the design of the assessment engine to accurately identify true positives without introducing significant latency to the development workflow. This setup also implies significant infrastructure for managing test environments and computational resources for LLM execution.
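One way to picture the assessment engine mentioned above is through mutation testing: a generated test is trusted only if it passes on the real code but fails ("kills") a deliberately broken mutant of it, which suggests the test checks real behavior rather than producing a false positive. The following is a hedged sketch under that assumption; `FlipComparison`, `make_mutant`, and `assess` are illustrative names, not the article's actual components.

```python
# Sketch of a mutation-based assessment step (illustrative, not Meta's design):
# trust a generated test only if it passes on the original code AND fails on
# a mutant, indicating it asserts on genuine behavior.

import ast

class FlipComparison(ast.NodeTransformer):
    """One simple mutation operator: flip the first `<` into `>=`."""
    def __init__(self):
        self.done = False

    def visit_Compare(self, node):
        if not self.done and isinstance(node.ops[0], ast.Lt):
            node.ops[0] = ast.GtE()
            self.done = True
        return node

def make_mutant(source: str) -> str:
    # Parse the source, apply the mutation operator, and re-emit code.
    tree = FlipComparison().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)

def passes(code: str, test: str) -> bool:
    # Load the code and the test into one scope, then run the test.
    scope = {}
    exec(code, scope)
    exec(test, scope)
    try:
        scope["test_case"]()
        return True
    except AssertionError:
        return False

def assess(source: str, test: str) -> bool:
    # True-positive signal: passes on real code, killed by the mutant.
    return passes(source, test) and not passes(make_mutant(source), test)

original = "def is_minor(age):\n    return age < 18\n"
test = "def test_case():\n    assert is_minor(10) and not is_minor(30)\n"
print(assess(original, test))  # prints True: the test kills the mutant
```

A production system would of course run many mutation operators in sandboxed environments and feed the kill/survive statistics back into the generation loop, which is where the latency and infrastructure concerns above come in.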
This approach highlights the increasing convergence of AI and software engineering, particularly in the realm of development tooling and infrastructure. Designing such a system necessitates careful consideration of data pipelines for code analysis, model serving for LLMs, and intelligent feedback mechanisms to refine test generation over time.