DZone Microservices · April 1, 2026

DocumentDB High Availability on Kubernetes with an Operator

This article explores deploying DocumentDB, a MongoDB-compatible database built on PostgreSQL, with high availability on Kubernetes using a dedicated operator. It details the architecture for local HA, leveraging CloudNativePG for WAL replication and failover, and demonstrates automatic primary failover to ensure minimal downtime for applications.

Read original on DZone Microservices

The article focuses on local high availability (HA) for DocumentDB within a single Kubernetes cluster. Because DocumentDB is a MongoDB-compatible database built on PostgreSQL, its HA mechanisms rely on the underlying PostgreSQL capabilities. The DocumentDB Kubernetes Operator extends Kubernetes with custom resources that manage these database clusters declaratively, automating deployment, scaling, upgrades, and HA scenarios.

High Availability Levels

The DocumentDB Kubernetes Operator offers multiple layers of high availability, addressing various failure domains:

  • Local HA: Deploys multiple database instances (primary + replicas) within a single Kubernetes cluster. It provides automatic failover in seconds, protecting against pod and node failures. This is the focus of the article.
  • Availability Zone Spreading: Configures replicas across different availability zones to survive a full zone outage.
  • Multi-Region HA: Spans Azure regions using KubeFleet, with physical WAL replication and manual failover.
  • Multi-Cloud HA: Spans providers such as Azure, AWS, and GCP using Istio, also with physical WAL replication and manual failover.

Architecture for Local HA

The local HA architecture for DocumentDB on Kubernetes integrates several key components:

  • CloudNativePG (CNPG) Cluster: This forms the foundation, managing PostgreSQL streaming replication (1 primary + N replicas) based on Write-Ahead Log (WAL) and orchestrating automatic failover.
  • DocumentDB Gateway: A sidecar container injected into each PostgreSQL pod. It provides MongoDB compatibility by translating the MongoDB wire protocol into calls to the PostgreSQL DocumentDB extension.
  • Kubernetes Services: A layered service architecture manages different access patterns. Internal PostgreSQL services handle operations, metrics, and backups, while an external Gateway service routes MongoDB client traffic to the current primary by dynamically tracking the `cnpg.io/instanceRole: primary` label.
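
The label-based routing described above can be illustrated with a small simulation. This is a minimal sketch in plain Python, not the operator's code: the pod names, IP addresses, the `select_endpoints` helper, and the `replica` label value are assumptions made for illustration; only the `cnpg.io/instanceRole: primary` label comes from the article.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    ip: str
    labels: dict = field(default_factory=dict)


def select_endpoints(pods, selector):
    """Return the IPs of pods whose labels contain every key/value pair in selector."""
    return [
        pod.ip
        for pod in pods
        if all(pod.labels.get(key) == value for key, value in selector.items())
    ]


# Hypothetical three-instance cluster: one primary and two replicas.
pods = [
    Pod("documentdb-1", "10.0.0.11", {"cnpg.io/instanceRole": "primary"}),
    Pod("documentdb-2", "10.0.0.12", {"cnpg.io/instanceRole": "replica"}),
    Pod("documentdb-3", "10.0.0.13", {"cnpg.io/instanceRole": "replica"}),
]

# The external Gateway service is modeled as a selector on the primary role label.
gateway_selector = {"cnpg.io/instanceRole": "primary"}
primary_endpoints = select_endpoints(pods, gateway_selector)
print(primary_endpoints)
```

Because the selector matches on a label rather than a pod name, the set of endpoints changes whenever the label moves to a different pod, with no change to the service itself.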

Automatic Failover Mechanism

When the primary DocumentDB instance fails, CNPG detects the failure, promotes the most advanced replica to primary, and updates the pod labels. The external Gateway service then routes to the new primary automatically. The external IP stays the same, so applications see minimal disruption and no manual DNS changes are needed.
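
The failover sequence can be sketched as a toy model. This is a hedged illustration, not CNPG's actual implementation: the instance records, the `replayed_lsn` field, the `healthy` flag, and the `fail_over` and `primary_targets` helpers are all hypothetical, and "most advanced" is assumed here to mean the highest replayed WAL position.

```python
ROLE = "cnpg.io/instanceRole"


def fail_over(instances):
    """Promote the healthy replica with the highest replayed WAL position.

    Updates the role labels in place and returns the new primary's name.
    """
    candidates = [
        inst for inst in instances
        if inst["healthy"] and inst["labels"][ROLE] == "replica"
    ]
    if not candidates:
        raise RuntimeError("no healthy replica available for promotion")
    new_primary = max(candidates, key=lambda inst: inst["replayed_lsn"])
    # Remove the primary label from the failed instance, then move it.
    for inst in instances:
        if inst["labels"][ROLE] == "primary":
            inst["labels"][ROLE] = "replica"
    new_primary["labels"][ROLE] = "primary"
    return new_primary["name"]


def primary_targets(instances):
    """Model the Gateway service: route only to healthy pods labeled primary."""
    return [
        inst["ip"] for inst in instances
        if inst["healthy"] and inst["labels"][ROLE] == "primary"
    ]


# Hypothetical cluster state right after the primary pod has failed.
instances = [
    {"name": "documentdb-1", "ip": "10.0.0.11", "healthy": False,
     "replayed_lsn": 1000, "labels": {ROLE: "primary"}},
    {"name": "documentdb-2", "ip": "10.0.0.12", "healthy": True,
     "replayed_lsn": 990, "labels": {ROLE: "replica"}},
    {"name": "documentdb-3", "ip": "10.0.0.13", "healthy": True,
     "replayed_lsn": 998, "labels": {ROLE: "replica"}},
]

before = primary_targets(instances)
promoted = fail_over(instances)
after = primary_targets(instances)
print(before, promoted, after)
```

In this model the service has no target between the failure and the promotion, which corresponds to the brief window of unavailability that automatic failover is meant to keep short.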

Tags: Kubernetes, DocumentDB, High Availability, Database Operator, PostgreSQL, MongoDB, CloudNativePG, Failover
