Dev.to #systemdesign·May 18, 2026

Zero-Downtime Database Migrations with Expand-Contract Pattern

This article discusses the critical challenges of performing database schema migrations without downtime in distributed systems, where backward compatibility is essential. It thoroughly explains the expand-contract pattern as a canonical approach to safely evolve schemas, detailing each phase (expand, migrate, contract) and providing specific techniques for various DDL operations like column, index, and table changes. The article emphasizes the importance of orchestration, automation, monitoring, and idempotency for successful production deployments.


The Challenge of Zero-Downtime Migrations

Database migrations are high-risk operations in a distributed system, especially when the goal is zero-downtime deployment. Schema changes can cause table locks, broken queries, or data corruption. The core challenge is backward compatibility: during a rolling deployment, service instances running the old code and instances running the new code hit the same database at the same time. Every migration must therefore leave the schema usable by both code versions at once.

The Expand-Contract Pattern

The expand-contract pattern is the industry-standard approach for achieving zero-downtime database migrations. It breaks schema evolution into three distinct phases. The first two are additive, so they can be rolled back safely; only the final contract step is destructive, which is why it runs last and only after verification.

  1. Expand Phase: New schema elements (columns, tables, indexes) are added without modifying or removing existing ones. New columns should be nullable or have default values. Old code continues to operate on the old schema, while new code can write to both old and new elements.
  2. Migration Phase: Data is copied from the old schema elements to the new ones, typically by a backfill script. The backfill should run in small batches, support pausing and resuming, and hold locks only briefly. When a change requires rebuilding a table, tools that rebuild it online (e.g., `pg_repack` for PostgreSQL, `pt-online-schema-change` for MySQL) avoid long blocking locks.
  3. Contract Phase: Old, no longer needed schema elements are removed. This phase is executed only after all application instances have been updated to use the new schema exclusively. This phase is generally irreversible and requires thorough verification (code audit, monitoring of query plans) to ensure no dependencies remain.
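
The three phases can be sketched end to end against an in-memory SQLite database. This is a minimal illustration, not the article's own code: the `users` table, the `full_name` and `display_name` columns, and every function name are hypothetical, and the batch size is kept tiny so the backfill visibly takes more than one batch. A production backfill against PostgreSQL or MySQL would also throttle between batches and watch replication lag.

```python
import sqlite3

BATCH_SIZE = 2  # deliberately tiny so the demo needs several batches


def expand(conn):
    """Expand: add the new column as nullable so old code keeps working."""
    conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")


def old_code_insert(conn, full_name):
    """Old application version: only knows about full_name."""
    conn.execute("INSERT INTO users (full_name) VALUES (?)", (full_name,))


def new_code_insert(conn, name):
    """New application version: dual-writes to the old and new columns."""
    conn.execute(
        "INSERT INTO users (full_name, display_name) VALUES (?, ?)",
        (name, name),
    )


def backfill(conn, batch_size=BATCH_SIZE):
    """Migrate: copy old values into the new column in small batches.

    Each batch commits on its own, so the job can be stopped and resumed;
    rows that are already filled are skipped by the WHERE clause.
    Returns the number of batches that updated at least one row.
    """
    batches = 0
    while True:
        cur = conn.execute(
            """
            UPDATE users SET display_name = full_name
            WHERE id IN (
                SELECT id FROM users
                WHERE display_name IS NULL
                ORDER BY id LIMIT ?
            )
            """,
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:
            return batches
        batches += 1


def ready_to_contract(conn):
    """Contract gate: every row must already carry the new column."""
    (missing,) = conn.execute(
        "SELECT COUNT(*) FROM users WHERE display_name IS NULL"
    ).fetchone()
    return missing == 0


def run_demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)"
    )
    for name in ("Ada", "Grace", "Linus"):
        old_code_insert(conn, name)
    expand(conn)
    old_code_insert(conn, "Margaret")  # old code still works after expand
    new_code_insert(conn, "Barbara")   # new code dual-writes
    before = ready_to_contract(conn)
    batches = backfill(conn)
    after = ready_to_contract(conn)
    return before, batches, after
```

The sketch stops at the gate and does not drop `full_name`: the data check is only one precondition, and the contract step should also wait until no deployed code reads or writes the old column.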

Specific DDL Operation Considerations

The article details specific approaches for common DDL operations:

  • Column Additions: Add the column as nullable first, backfill the data, then add the `NOT NULL` constraint. PostgreSQL 11 and later can also add a column with a constant default, including `NOT NULL`, without rewriting the table.
  • Index Changes: Use concurrent index creation (e.g., `CREATE INDEX CONCURRENTLY` in PostgreSQL) to avoid blocking writes; note that this statement cannot run inside a transaction block. Verify index usage with `EXPLAIN` plans. Drop indexes only after confirming no query patterns depend on them.
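  • Table Changes: Renaming a table is done in two phases: first rename it and leave a view or synonym under the old name so existing code keeps working, then update the code to use the new name and drop the alias. Splitting a table involves creating the new tables, dual-writing to old and new, backfilling, and then switching reads.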
  • Foreign Key Additions: In high-traffic systems, add the constraint as `NOT VALID` (PostgreSQL), then validate it later with `VALIDATE CONSTRAINT`, which checks existing rows without blocking normal reads and writes.
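
As a rough illustration of how the column, index, and foreign-key techniques combine, the sketch below builds an ordered list of PostgreSQL statements in Python. The `Step` class, the `add_foreign_key_plan` helper, and the `idx_`/`fk_` naming scheme are assumptions made for this example; only the SQL statements themselves are standard PostgreSQL. The `in_transaction` flag records that `CREATE INDEX CONCURRENTLY` has to run outside a transaction block.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    sql: str
    in_transaction: bool  # False for statements PostgreSQL rejects inside BEGIN


def add_foreign_key_plan(table, column, ref_table, ref_column="id"):
    """Build an ordered, low-lock plan for adding an indexed foreign key.

    Identifiers are interpolated directly, so they must come from trusted
    migration code, never from user input.
    """
    index = f"idx_{table}_{column}"
    constraint = f"fk_{table}_{column}"
    return [
        # 1. Expand: nullable column with no default, so no table rewrite.
        Step(f"ALTER TABLE {table} ADD COLUMN {column} bigint", True),
        # 2. Build the index without blocking writes.
        Step(
            f"CREATE INDEX CONCURRENTLY IF NOT EXISTS {index} "
            f"ON {table} ({column})",
            False,
        ),
        # 3. Register the constraint without scanning existing rows.
        Step(
            f"ALTER TABLE {table} ADD CONSTRAINT {constraint} "
            f"FOREIGN KEY ({column}) REFERENCES {ref_table} ({ref_column}) "
            "NOT VALID",
            True,
        ),
        # 4. Check existing rows later, in its own transaction.
        Step(f"ALTER TABLE {table} VALIDATE CONSTRAINT {constraint}", True),
    ]


if __name__ == "__main__":
    for step in add_foreign_key_plan("orders", "customer_id", "customers"):
        mode = "tx" if step.in_transaction else "no-tx"
        print(f"[{mode}] {step.sql};")
```

Keeping each statement as a separate step lets a runner execute them one at a time and pause between the `NOT VALID` registration and the later validation.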

Idempotency and Monitoring

Production migrations require robust orchestration and automation using tools like Flyway, Liquibase, or Alembic. Migration scripts must be idempotent, meaning they can be run multiple times without adverse effects. During execution, critical monitoring (CPU, replication lag, lock contention, query latency) is essential, with alerts and a clear rollback plan in case of performance degradation.
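
One common way to get idempotency is to record each applied migration in a tracking table and skip anything already recorded. The sketch below shows that idea with SQLite from the Python standard library. It is an illustration only and not how Flyway, Liquibase, or Alembic are implemented; the `schema_migrations` table, the version strings, and `apply_migrations` are hypothetical names.

```python
import sqlite3

# Ordered (version, statement) pairs; contents are illustrative only.
MIGRATIONS = [
    ("001_create_users",
     "CREATE TABLE users (id INTEGER PRIMARY KEY, full_name TEXT)"),
    ("002_add_display_name",
     "ALTER TABLE users ADD COLUMN display_name TEXT"),
]


def apply_migrations(conn, migrations=MIGRATIONS):
    """Apply each migration at most once; return the versions applied now."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)"
    )
    done = {
        row[0] for row in conn.execute("SELECT version FROM schema_migrations")
    }
    applied = []
    for version, sql in migrations:
        if version in done:
            continue  # already recorded, so a re-run skips it
        conn.execute(sql)
        conn.execute(
            "INSERT INTO schema_migrations (version) VALUES (?)", (version,)
        )
        conn.commit()
        applied.append(version)
    return applied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(apply_migrations(conn))  # first run applies both migrations
    print(apply_migrations(conn))  # second run finds nothing left to do
```

Running the function twice against the same connection applies both migrations the first time and nothing the second time, which is the property a retried or restarted deployment relies on.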

Tags: database migration, zero-downtime, expand-contract pattern, DDL, backward compatibility, distributed systems, DevOps, schema evolution
