feat: Add built-in backup verification and integrity testing #21283

@dmzoneill

Feature Description

Add a backup verification mechanism that validates the integrity and recoverability of etcd backups before they're needed for disaster recovery.

Motivation

  • Backups are critical for etcd cluster recovery
  • Discovering backup corruption during an actual disaster is catastrophic
  • The current etcdctl snapshot save command does not verify that a backup can actually be restored
  • Organizations need confidence their backups are valid before disaster strikes

Proposed Solution

Extend etcdctl with backup verification commands:

# Verify backup integrity
etcdctl snapshot verify backup.db

# Test backup restore (dry-run)
etcdctl snapshot restore --dry-run backup.db

# Full validation with consistency checks
etcdctl snapshot validate backup.db --full

The verification should:

  1. Check file integrity (checksums, corruption detection)
  2. Validate Raft metadata and cluster state consistency
  3. Verify all keys are readable
  4. Test the restore process in an isolated environment
  5. Report any inconsistencies or corruption
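
As a rough illustration of item 1 only, here is a minimal Python sketch of a file integrity check. It relies on the fact that snapshots written by etcdctl snapshot save carry a SHA-256 digest of the database content in their last 32 bytes. The function names (has_trailing_hash, verify_snapshot_hash) are hypothetical and are not part of etcd; a real implementation would presumably live in etcdctl/etcdutl itself.

```python
import hashlib
import os

SHA256_SIZE = hashlib.sha256().digest_size  # 32 bytes


def has_trailing_hash(size: int) -> bool:
    """Heuristic: the database part is a multiple of 512 bytes, so a
    snapshot with an appended digest has size % 512 == 32."""
    return size % 512 == SHA256_SIZE


def verify_snapshot_hash(path: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the trailing SHA-256 matches the preceding bytes.

    Hypothetical helper for illustration; raises ValueError when the file
    does not look like it carries a trailing digest at all.
    """
    size = os.path.getsize(path)
    if not has_trailing_hash(size):
        raise ValueError(f"{path}: no trailing SHA-256 digest detected")

    remaining = size - SHA256_SIZE
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Hash everything except the final 32 bytes, in bounded chunks.
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break  # file shrank while reading; comparison below fails
            digest.update(chunk)
            remaining -= len(chunk)
        stored = f.read(SHA256_SIZE)
    return digest.digest() == stored
```

A hash match only shows the file was not altered after it was saved. It says nothing about Raft metadata, key readability, or whether a restore succeeds, which is why items 2 to 4 are still needed.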

Benefits

  • Catch backup corruption early, not during disaster recovery
  • Automated backup validation in CI/CD pipelines
  • Confidence in disaster recovery procedures
  • Compliance with backup testing requirements

Use Cases

  • Automated daily backup validation
  • Pre-migration backup verification
  • Compliance audits requiring tested backups
  • Disaster recovery drills
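
Until something like this is built in, a daily validation job or recovery drill can be approximated with existing tooling. The sketch below assumes etcdutl (v3.5 or later) is on PATH and uses its existing snapshot restore command with --data-dir pointed at a throwaway directory. The helper names (build_restore_command, restore_drill) are hypothetical, and a zero exit code is only a coarse signal, not the full validation proposed above.

```python
import os
import subprocess
import tempfile
from typing import Callable, List


def build_restore_command(snapshot: str, data_dir: str) -> List[str]:
    """Command line for restoring a snapshot into a scratch data directory."""
    return ["etcdutl", "snapshot", "restore", snapshot, "--data-dir", data_dir]


def restore_drill(snapshot: str, run: Callable = subprocess.run) -> bool:
    """Restore the snapshot into a temporary directory and report success.

    Hypothetical helper: `run` is injectable so the drill can be exercised
    without etcdutl installed. The scratch directory is deleted afterwards,
    so no production data directory is touched.
    """
    with tempfile.TemporaryDirectory(prefix="etcd-restore-drill-") as scratch:
        # Point at a path that does not exist yet, since restore refuses
        # to write into a non-empty data directory.
        data_dir = os.path.join(scratch, "restored.etcd")
        result = run(
            build_restore_command(snapshot, data_dir),
            capture_output=True,
            text=True,
        )
        return result.returncode == 0
```

In a CI job this could be called as restore_drill("backup.db"), failing the pipeline when it returns False.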
