The New Stack·March 24, 2026

Velero and Kubernetes Backup: Architecting Disaster Recovery for Cloud-Native Applications

This article discusses Broadcom's donation of Velero to the CNCF and Velero's role as an open-source tool for Kubernetes cluster backup, restoration, and migration. It covers the lack of built-in cluster-level backup in Kubernetes and Velero's importance for disaster recovery, workload portability, and stateful applications in cloud-native environments.

Read original on The New Stack

The Challenge of Kubernetes Backup and Disaster Recovery

Kubernetes is designed to orchestrate containerized applications; it does not ship with cluster-level backup or disaster recovery. That gap is a real problem for organizations that run stateful applications and need data persistence, resilience, and workload portability across environments. Traditional enterprise backup products often struggle with the dynamic, distributed nature of Kubernetes deployments, which makes it hard to fit cloud-native platforms into existing enterprise IT strategies.

Velero: A Kubernetes-Native Solution

Velero is an open-source tool built to address these challenges. It provides Kubernetes-native backup, restore, and migration for entire clusters or for selected applications. It protects both Kubernetes API objects (e.g., Deployments, Services, ConfigMaps) and persistent data (e.g., the contents of PersistentVolumes); true disaster recovery and workload mobility require both.

  • Backup: Captures the state of Kubernetes resources and persistent volumes.
  • Restore: Recovers clusters or specific applications to a previous state.
  • Migration: Enables portability of workloads between different Kubernetes clusters or cloud providers.
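The three operations above map onto Velero's custom resources. The following is a minimal sketch, not taken from the article: it builds a `velero.io/v1` `Backup` and a matching `Restore` as plain Python dictionaries and prints them as JSON. The object names (`shop-daily`, `shop-restore`), the namespaces (`shop`, `shop-dr`), and the 30-day TTL are hypothetical, and the `velero` install namespace is assumed to be the default.

```python
import json

# Assumption: Velero is installed in its default namespace.
VELERO_NAMESPACE = "velero"


def backup_manifest(name, namespaces, ttl="720h0m0s"):
    """Build a velero.io/v1 Backup object as a plain dict."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Backup",
        "metadata": {"name": name, "namespace": VELERO_NAMESPACE},
        "spec": {
            # Namespaces whose resources should be captured.
            "includedNamespaces": list(namespaces),
            # Ask Velero to snapshot persistent volumes as well.
            "snapshotVolumes": True,
            # How long the backup is retained before it expires.
            "ttl": ttl,
        },
    }


def restore_manifest(name, backup_name, namespace_mapping=None):
    """Build a velero.io/v1 Restore object that replays a named backup."""
    spec = {"backupName": backup_name, "restorePVs": True}
    if namespace_mapping:
        # Restoring into differently named namespaces is one way to
        # express a migration or a side-by-side recovery test.
        spec["namespaceMapping"] = dict(namespace_mapping)
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Restore",
        "metadata": {"name": name, "namespace": VELERO_NAMESPACE},
        "spec": spec,
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    backup = backup_manifest("shop-daily", ["shop"])
    restore = restore_manifest("shop-restore", "shop-daily", {"shop": "shop-dr"})
    print(json.dumps(backup, indent=2))
    print(json.dumps(restore, indent=2))
```

Because Kubernetes accepts JSON manifests, the printed objects could in principle be saved to a file and applied with `kubectl apply -f`. For a migration, the `Restore` would be created in the target cluster, which must be able to read the same backup storage location as the source cluster.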

Why CNCF Governance Matters

Donating Velero to the CNCF as a Sandbox project is a strategic move that opens the project to community contributions under vendor-neutral governance. Broader community involvement and vendor neutrality are expected to give organizations more confidence in adopting Velero for critical production workloads and to support the project's longevity.

Architectural Implications for Stateful Applications

For architects designing systems with stateful applications on Kubernetes, a backup tool like Velero is fundamental: it allows databases, message queues, and other data-dependent services to be recovered after a failure. Separately, the article notes features in the VKS 3.6 release, such as "QD profiles" (performance recipes for OS customization) and a "bring your own CNI" model, that give deeper control over the underlying infrastructure and networking and so help tune the performance and manageability of stateful workloads.

When designing for disaster recovery in Kubernetes, consider not just application pods but also their associated configurations, Secrets, PersistentVolumeClaims, and the underlying PersistentVolumes. Velero lets these be captured and restored together, so a single backup strategy can cover all of them.
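One way to make this checklist concrete is to test a backup definition against it. The sketch below is an illustrative assumption rather than part of Velero: `uncovered_resources` is a hypothetical helper that reads the `includedResources` and `excludedResources` fields of a `Backup` spec and reports which of the resource types named above would be left out. It compares plain lowercase plural names only, whereas Velero itself also resolves short names and group-qualified names.

```python
# Resource types the paragraph above calls out for disaster recovery.
DR_RESOURCES = (
    "pods",
    "configmaps",
    "secrets",
    "persistentvolumeclaims",
    "persistentvolumes",
)


def uncovered_resources(spec, required=DR_RESOURCES):
    """Return the required resource types that a Backup spec would skip.

    Simplification: names are matched as exact strings. An empty or
    missing includedResources list is treated as "include everything",
    which mirrors Velero's documented default.
    """
    included = set(spec.get("includedResources") or ["*"])
    excluded = set(spec.get("excludedResources") or [])
    missing = []
    for resource in required:
        selected = "*" in included or resource in included
        if not selected or resource in excluded:
            missing.append(resource)
    return missing


if __name__ == "__main__":
    # Hypothetical spec that forgets Secrets and volume objects.
    narrow_spec = {"includedResources": ["pods", "configmaps"]}
    print(uncovered_resources(narrow_spec))
```

A check like this could run in CI against backup manifests so that a narrowed resource filter does not silently drop Secrets or volume claims from the recovery plan.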
