Add application-consistent backup procedures for stateful services #265

@accuser

Context

The current backup strategy relies on Incus volume snapshots (make backup-snapshot) and volume exports (make backup-export). These provide filesystem-level (crash) consistency, but stateful services such as PostgreSQL need application-level backups to guarantee a logically consistent, restorable copy of their data.

Problem

Volume snapshots capture disk state at a point in time, but:

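  • PostgreSQL: A snapshot taken while the server is running is at best crash-consistent. Restoring it depends on WAL replay, works only if the data directory and WAL were captured atomically, and ties the backup to the same PostgreSQL major version. pg_dump and pg_dumpall produce a logically consistent, portable export.
  • Forgejo: Git repositories on disk may be mid-write when the snapshot is taken. The forgejo dump command creates a consistent backup.
  • step-ca: The CA database (Badger) should be cleanly exported rather than snapshotted mid-operation.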

Issue #224 covers backup verification and restore testing, but not the backup method itself.

Proposed Solution

1. Application-level backup targets

make backup-postgresql ENV=cluster01    # pg_dumpall: all databases and roles
make backup-forgejo ENV=cluster01       # forgejo dump
make backup-consistent ENV=cluster01    # All app-level backups
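
One way the aggregate target could behave is sketched below: run every application-level backup, continue past a failing one so the remaining services are still backed up, and exit non-zero if any step failed. The function name run_all_backups and the step names in the usage comment are hypothetical; each step is assumed to be a command or shell function that returns non-zero on failure.

```shell
# run_all_backups STEP [STEP...]
# Runs each STEP in order. A failing step is reported on stderr but does
# not stop the remaining steps. Returns 1 if any step failed, else 0.
run_all_backups() {
    failed=0
    for step in "$@"; do
        if "$step"; then
            echo "ok: $step"
        else
            echo "FAILED: $step" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Hypothetical usage from the backup-consistent recipe:
#   run_all_backups backup_postgresql backup_forgejo
```

Continuing after a failure is a design choice: a broken Forgejo dump should not prevent that day's PostgreSQL dump, as long as the overall exit status still signals the problem.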

2. Implementation approach

PostgreSQL:

incus exec postgresql01 -- pg_dumpall --clean > backups/postgresql-$(date +%Y%m%d).sql
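
A plain redirect leaves a truncated file under the final name if the dump fails partway. The sketch below writes to a temporary name and renames it only on success. The helper name dump_to_file and the .partial suffix are assumptions, and it relies on incus exec passing through the exit status of the command it runs. Depending on how authentication is set up in the instance, pg_dumpall may also need to run as the postgres OS user rather than root.

```shell
# dump_to_file DEST CMD [ARGS...]
# Runs CMD with stdout redirected to DEST.partial, then renames it to DEST
# only if CMD exited 0 and produced a non-empty file. On failure the
# partial file is removed and 1 is returned.
dump_to_file() {
    dest=$1
    shift
    tmp="$dest.partial"
    mkdir -p "$(dirname "$dest")" || return 1
    if "$@" > "$tmp" && [ -s "$tmp" ]; then
        mv "$tmp" "$dest"
    else
        rm -f "$tmp"
        echo "backup failed: $dest not written" >&2
        return 1
    fi
}

# Hypothetical usage with the command from this issue:
#   dump_to_file "backups/postgresql-$(date +%Y%m%d).sql" \
#       incus exec postgresql01 -- pg_dumpall --clean
```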

Forgejo:

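incus exec forgejo01 -- forgejo dump -c /etc/forgejo/app.ini -f /tmp/forgejo-dump.zip
incus file pull forgejo01/tmp/forgejo-dump.zip backups/forgejo-$(date +%Y%m%d).zip
incus exec forgejo01 -- rm -f /tmp/forgejo-dump.zip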
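
The Forgejo steps could be wrapped in one function, sketched here. A wildcard in the incus file pull source would be seen by the local shell rather than expanded inside the instance, so the sketch gives the dump a fixed name with -f and date-stamps the local copy. The function name backup_forgejo and the /tmp/forgejo-dump.zip path are assumptions. Forgejo may also refuse to run as root, in which case the dump would need to run as the service user.

```shell
# backup_forgejo INSTANCE DEST_DIR
# Creates a dump inside INSTANCE under a fixed name, pulls it to
# DEST_DIR/forgejo-YYYYMMDD.zip, then removes the copy in the instance.
backup_forgejo() {
    instance=$1
    dest_dir=$2
    remote=/tmp/forgejo-dump.zip   # assumed scratch location in the instance
    dest="$dest_dir/forgejo-$(date +%Y%m%d).zip"
    mkdir -p "$dest_dir" || return 1
    # Clear any leftover from an earlier failed run before dumping.
    incus exec "$instance" -- rm -f "$remote" || return 1
    incus exec "$instance" -- forgejo dump -c /etc/forgejo/app.ini -f "$remote" || return 1
    incus file pull "$instance$remote" "$dest" || return 1
    # Remove the dump inside the instance so /tmp does not fill up.
    incus exec "$instance" -- rm -f "$remote"
}

# Hypothetical usage:
#   backup_forgejo forgejo01 backups
```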

3. Complement, don't replace, volume snapshots

  • Volume snapshots remain the primary fast-recovery mechanism
  • Application-level backups serve as a secondary, portable, verified backup
  • Both should run on schedule (snapshots more frequently, app-level daily)

Acceptance Criteria

  • make backup-postgresql target that runs pg_dumpall
  • make backup-forgejo target for Forgejo dump
  • make backup-consistent target that runs all application-level backups
  • Backup output stored in a consistent location with date-stamped filenames
  • Documentation on restore procedures for each service

Relates To

Priority

Medium — risk increases as data accumulates in PostgreSQL/Forgejo

Labels

enhancement, infrastructure
