Backup & Recovery Strategies¶
Your database is only as reliable as your ability to restore it. A backup you have never tested is not a backup - it is a hope. This guide covers the tools, techniques, and strategies for protecting MySQL and PostgreSQL data against every failure mode from accidental DELETE to total disk loss.
Logical vs Physical Backups¶
Every backup method falls into one of two categories, and understanding the trade-offs determines which tools you reach for.
A logical backup exports data as SQL statements or delimited text. You get a portable, human-readable file that can be loaded into any compatible database version. The trade-off is speed - logical backups read every row through the SQL layer, and restoring means re-executing every INSERT. For a 500 GB database, that can take hours or days.
A physical backup copies the raw data files (tablespace files, WAL segments, redo logs) at the filesystem or storage level. These are fast to create and fast to restore because they skip the SQL layer entirely. The trade-off is portability - a physical backup is tied to the exact database version, architecture, and often the same configuration.
| Factor | Logical | Physical |
|---|---|---|
| Speed (backup) | Slow - reads through SQL | Fast - copies raw files |
| Speed (restore) | Slow - re-executes SQL | Fast - copies files back |
| Size | Smaller (text, compresses well) | Larger (raw pages, internal fragmentation) |
| Portability | High - works across versions and platforms | Low - same version and architecture |
| Granularity | Per-table, per-database, or full | Usually full instance (some tools support incremental) |
| Consistency | Snapshot via transactions or locking | Requires coordination (flush, freeze, or hot backup tool) |
| PITR support | No (without binary log/WAL) | Yes (with log replay) |
When to use each:
- Logical: migrating between versions, exporting specific tables, small-to-medium databases, cross-platform transfers
- Physical: large databases where restore speed matters, disaster recovery, setting up replicas, when you need point-in-time recovery
Most production environments use both - nightly physical backups for fast disaster recovery, plus periodic logical backups for portability and schema versioning.
MySQL Backup Tools¶
mysqldump¶
mysqldump is the standard logical backup tool shipped with MySQL. It produces SQL output that recreates the schema and data.
A basic full backup:
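A typical invocation looks like the following sketch (the output path is illustrative, and it assumes the client can authenticate without prompting, e.g. via `~/.my.cnf`):

```shell
# Consistent full logical backup of every database to a single SQL file
mysqldump --single-transaction --routines --triggers --events \
    --all-databases > /backup/full_backup.sql
```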
The flags that matter:
| Flag | Purpose |
|---|---|
| `--single-transaction` | Takes a consistent snapshot using a transaction (InnoDB only, no table locks) |
| `--routines` | Includes stored procedures and functions |
| `--triggers` | Includes triggers (on by default in MySQL 5.7+, but be explicit) |
| `--events` | Includes scheduled events |
| `--all-databases` | Dumps every database |
| `--flush-logs` | Rotates binary logs after the dump, useful for PITR |
| `--source-data=2` | Records the binary log position as a comment (MySQL 8.0.26+, replaces `--master-data`) |
| `--set-gtid-purged=OFF` | Prevents GTID-related errors when restoring to a non-GTID server |
MyISAM and --single-transaction
--single-transaction only provides consistency for InnoDB tables. If your database contains MyISAM tables, you need --lock-all-tables instead, which blocks all writes during the dump. On modern MySQL, you should be using InnoDB for everything.
Restoring a mysqldump backup:
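In the simplest case, feed the dump back through the `mysql` client (filenames are illustrative):

```shell
# Replay the dump - this re-executes every statement in the file
mysql < /backup/full_backup.sql

# If the dump was compressed at backup time
zcat /backup/full_backup.sql.gz | mysql
```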
For large restores, disable checks to speed up the import:
```shell
mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit=2; SET GLOBAL sync_binlog=0;"
mysql < full_backup.sql
mysql -e "SET GLOBAL innodb_flush_log_at_trx_commit=1; SET GLOBAL sync_binlog=1;"
```
mydumper and myloader¶
mydumper is a third-party tool that parallelizes logical backups. Where mysqldump uses a single thread, mydumper can dump multiple tables simultaneously and splits large tables into chunks. The companion tool myloader restores in parallel.
```shell
# Parallel backup with 4 threads
mydumper --threads 4 --compress --outputdir /backup/mysql/daily

# Parallel restore
myloader --threads 4 --directory /backup/mysql/daily
```
For a 100 GB database, mydumper/myloader can be 4-8x faster than mysqldump/mysql depending on hardware and table count.
mysqlpump¶
mysqlpump is MySQL's built-in parallel dump tool (introduced in 5.7). It supports parallel processing and progress reporting but has seen less adoption than mydumper and was deprecated in MySQL 8.0.34 in favor of MySQL Shell's dump utilities.
PostgreSQL Backup Tools¶
pg_dump¶
pg_dump is the standard logical backup tool for PostgreSQL. Unlike mysqldump, it supports multiple output formats.
| Format | Flag | Description |
|---|---|---|
| Plain SQL | `-Fp` | SQL text output (default), restored with `psql` |
| Custom | `-Fc` | Compressed, supports selective restore, restored with `pg_restore` |
| Directory | `-Fd` | One file per table, supports parallel dump and restore |
| Tar | `-Ft` | Tar archive, restored with `pg_restore` |
The custom format (-Fc) is the best default choice - it compresses automatically, supports selective table restore, and allows parallel restore with pg_restore -j.
```shell
# Custom format (recommended)
pg_dump -Fc mydb > mydb.dump

# Directory format with parallel dump (4 jobs)
pg_dump -Fd -j 4 mydb -f /backup/mydb_dir

# Plain SQL (human-readable)
pg_dump -Fp mydb > mydb.sql
```
Restoring:
```shell
# From custom or tar format
pg_restore -d mydb mydb.dump

# Parallel restore (4 jobs, custom or directory format only)
pg_restore -d mydb -j 4 mydb.dump

# From plain SQL
psql -d mydb -f mydb.sql

# Restore specific tables only
pg_restore -d mydb -t orders -t customers mydb.dump
```
pg_dumpall¶
pg_dumpall dumps the entire cluster including roles, tablespaces, and other global objects that pg_dump does not capture.
```shell
# Full cluster dump (roles + all databases)
pg_dumpall > cluster_backup.sql

# Globals only (roles and tablespaces, no data)
pg_dumpall --globals-only > globals.sql
```
A common pattern is to use pg_dumpall --globals-only for roles and pg_dump -Fc per database for data. This gives you the flexibility of custom format for data while still capturing global objects.
pg_basebackup¶
pg_basebackup creates a physical backup by streaming a copy of the entire PostgreSQL data directory. This is the foundation for setting up streaming replicas and for physical backup strategies.
```shell
# Stream a physical backup to a directory
pg_basebackup -D /backup/pg_base -Fp -Xs -P -R

# Create a compressed tar backup
pg_basebackup -D /backup/pg_base -Ft -Xf -z
```
| Flag | Purpose |
|---|---|
| `-D` | Destination directory |
| `-Fp` | Plain format (data directory copy) |
| `-Ft` | Tar format |
| `-Xs` | Stream WAL during backup (recommended) |
| `-Xf` | Fetch WAL after backup completes |
| `-z` | Compress output (tar format only) |
| `-P` | Show progress |
| `-R` | Write standby.signal and connection settings for replica setup |
Percona XtraBackup¶
Percona XtraBackup creates hot physical backups of InnoDB without locking tables or interrupting transactions. It copies the InnoDB data files while tracking the redo log to capture changes made during the copy. This makes it the standard tool for MySQL physical backups in production.
Full Backup¶
```shell
# Create a full backup
xtrabackup --backup --target-dir=/backup/base

# Prepare the backup (apply redo log for consistency)
xtrabackup --prepare --target-dir=/backup/base

# Restore (MySQL must be stopped)
systemctl stop mysql
rm -rf /var/lib/mysql/*
xtrabackup --copy-back --target-dir=/backup/base
chown -R mysql:mysql /var/lib/mysql
systemctl start mysql
```
The prepare step is critical. A raw XtraBackup is not consistent - it contains data files from different points in time plus redo log entries. The prepare step replays the redo log to bring all data files to a consistent state, just like InnoDB crash recovery.
Incremental Backups¶
XtraBackup supports incremental backups that only copy pages changed since the last backup. This dramatically reduces backup time and storage for large databases.
```shell
# Full base backup (Sunday)
xtrabackup --backup --target-dir=/backup/base

# Incremental backup (Monday) - only changed pages
xtrabackup --backup --target-dir=/backup/inc1 \
    --incremental-basedir=/backup/base

# Incremental backup (Tuesday)
xtrabackup --backup --target-dir=/backup/inc2 \
    --incremental-basedir=/backup/inc1
```
Restoring an incremental chain requires preparing each increment in order:
```shell
# Prepare base (--apply-log-only prevents rollback of uncommitted transactions)
xtrabackup --prepare --apply-log-only --target-dir=/backup/base

# Apply incremental backups in order
xtrabackup --prepare --apply-log-only --target-dir=/backup/base \
    --incremental-dir=/backup/inc1
xtrabackup --prepare --apply-log-only --target-dir=/backup/base \
    --incremental-dir=/backup/inc2

# Final prepare (without --apply-log-only)
xtrabackup --prepare --target-dir=/backup/base

# Restore
systemctl stop mysql
rm -rf /var/lib/mysql/*
xtrabackup --copy-back --target-dir=/backup/base
chown -R mysql:mysql /var/lib/mysql
systemctl start mysql
```
The --apply-log-only flag
Use --apply-log-only during intermediate prepare steps. Without it, XtraBackup rolls back uncommitted transactions, which makes subsequent incremental applies fail. Only omit --apply-log-only on the final prepare step.
Point-in-Time Recovery (PITR)¶
Backups give you a snapshot. Point-in-time recovery fills the gap between the snapshot and the moment of failure (or the moment before an accidental change). PITR replays transaction logs from the backup's position forward to a specific timestamp or log position.
```mermaid
flowchart LR
    FB[Full Backup<br/>Sunday 02:00] --> WAL1[WAL Segment 1]
    WAL1 --> WAL2[WAL Segment 2]
    WAL2 --> WAL3[WAL Segment 3]
    WAL3 --> F[Failure<br/>Wednesday 14:30]
    FB --> R[Restore Base Backup]
    R --> RP[Replay WAL/Binlog<br/>to Target Time]
    RP --> REC[Recovered Database<br/>Wednesday 14:29]
```
MySQL: Binary Log Replay¶
MySQL's binary log records every data-modifying statement or row change. PITR replays these logs with mysqlbinlog.
Step 1: Restore from the most recent full backup.
Step 2: Identify the binary logs created after the backup.
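For example, list the logs the server knows about and find the starting position recorded in the dump (the exact comment text below is illustrative; it appears when the dump was taken with --source-data=2):

```shell
# List binary logs known to the server
mysql -e "SHOW BINARY LOGS"

# The dump header records where replay should begin, in a comment similar to:
# -- CHANGE REPLICATION SOURCE TO SOURCE_LOG_FILE='binlog.000042', SOURCE_LOG_POS=157;
grep -m1 'SOURCE_LOG_FILE' full_backup.sql
```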
Step 3: Replay binary logs up to the target time (just before the accidental operation).
```shell
mysqlbinlog --stop-datetime="2026-02-19 14:30:00" \
    /var/lib/mysql/binlog.000042 \
    /var/lib/mysql/binlog.000043 | mysql
```
To skip a specific bad transaction, use --start-position and --stop-position to replay around it:
```shell
# Replay everything up to the bad statement
mysqlbinlog --stop-position=12345 /var/lib/mysql/binlog.000042 | mysql

# Skip the bad statement and continue from after it
mysqlbinlog --start-position=12400 /var/lib/mysql/binlog.000042 | mysql
```
Binary logging must be enabled BEFORE disaster
PITR only works if log_bin is enabled and binary logs exist from the backup point forward. Verify with SHOW VARIABLES LIKE 'log_bin'. If binary logging was not enabled, you can only restore to the backup snapshot - everything after it is lost.
PostgreSQL: WAL Archiving and Recovery¶
PostgreSQL's equivalent is Write-Ahead Log (WAL) archiving. Every change is written to WAL segments before it reaches the data files. By archiving these segments, you can replay them to recover to any point in time.
Step 1: Configure WAL archiving in postgresql.conf:
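A minimal sketch of the relevant settings (the archive directory is illustrative; it must exist and be writable by the postgres user, and changing archive_mode requires a restart):

```ini
# postgresql.conf - minimal WAL archiving setup
wal_level = replica            # the default since PostgreSQL 10
archive_mode = on
archive_command = 'test ! -f /backup/wal_archive/%f && cp %p /backup/wal_archive/%f'
```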
Step 2: Take a base backup:
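For example, using pg_basebackup as described above (destination path is illustrative):

```shell
# Compressed tar-format base backup with WAL streamed during the backup
pg_basebackup -D /backup/pg_base -Ft -Xs -z -P
```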
Step 3: When recovery is needed, copy the base backup into position and configure recovery:
```shell
# Stop PostgreSQL
systemctl stop postgresql

# Replace the data directory with the backup
rm -rf /var/lib/postgresql/16/main
cp -a /backup/pg_base /var/lib/postgresql/16/main

# Configure recovery target
cat >> /var/lib/postgresql/16/main/postgresql.auto.conf <<EOF
restore_command = 'cp /backup/wal_archive/%f %p'
recovery_target_time = '2026-02-19 14:30:00'
EOF

# Create the recovery signal file
touch /var/lib/postgresql/16/main/recovery.signal

# Start PostgreSQL - it will replay WAL to the target time
chown -R postgres:postgres /var/lib/postgresql/16/main
systemctl start postgresql
```
PostgreSQL replays archived WAL segments using restore_command until it reaches recovery_target_time, then pauses. You verify the data and promote to normal operation:
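Once the data checks out, end recovery and open the database for writes. On PostgreSQL 13 and later, resuming replay after the target has been reached finishes recovery and promotes the server:

```shell
# Verify first, then finish recovery and promote to normal operation
psql -c "SELECT pg_wal_replay_resume();"
```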
Backup Verification¶
A backup that has never been restored is an untested assumption. Schedule regular verification to ensure your backups actually work.
Restore to a Test Instance¶
The most reliable verification is a full restore to a separate server or container:
```shell
# MySQL: restore to a test instance
mysql -h test-db-host < /backup/mysql_full_20260219.sql
mysql -h test-db-host -e "SELECT COUNT(*) FROM mydb.orders"

# PostgreSQL: restore to a test database
createdb -h test-db-host mydb_test
pg_restore -d mydb_test -h test-db-host /backup/pg_myapp_20260219.dump
psql -h test-db-host -d mydb_test -c "SELECT COUNT(*) FROM orders"
```
Checksum Validation¶
Verify backup file integrity before and after transfer:
```shell
# Generate checksums at backup time
sha256sum /backup/mysql_full_20260219.sql.gz > /backup/mysql_full_20260219.sha256

# Verify after transfer to offsite storage
sha256sum -c /backup/mysql_full_20260219.sha256
```
For XtraBackup, the --prepare step itself validates consistency: if prepare succeeds, the backup is almost certainly restorable. A periodic full restore to a test instance remains the definitive check.
Automated Verification Script¶
```shell
#!/bin/bash
# verify_backup.sh - restore and validate the latest backup
set -euo pipefail

BACKUP_FILE=$(ls -t /backup/mysql_full_*.sql.gz | head -1)
TEST_DB="backup_verify_$(date +%Y%m%d)"

# Create test database and restore
mysql -e "CREATE DATABASE ${TEST_DB}"
zcat "${BACKUP_FILE}" | mysql "${TEST_DB}"

# Run validation queries
TABLES=$(mysql -NBe "SELECT COUNT(*) FROM information_schema.tables WHERE table_schema='${TEST_DB}'")
ROWS=$(mysql -NBe "SELECT SUM(table_rows) FROM information_schema.tables WHERE table_schema='${TEST_DB}'")
echo "Backup: ${BACKUP_FILE}"
echo "Tables: ${TABLES}"
echo "Total rows: ${ROWS}"

# Compare against expected minimums
if [ "${TABLES}" -lt 10 ]; then
    echo "ALERT: Table count below threshold" >&2
    exit 1
fi

# Cleanup
mysql -e "DROP DATABASE ${TEST_DB}"
echo "Verification passed"
```
Automate verification weekly
Run backup verification on a schedule, not just when you suspect a problem. A cron job that restores to a test instance and checks row counts catches silent backup failures before they become disasters.
Scheduling and Rotation¶
Cron-Based Backup Scripts¶
A production backup script handles compression, logging, error reporting, and cleanup:
```shell
#!/bin/bash
# mysql_backup.sh - daily MySQL backup with rotation
set -euo pipefail

BACKUP_DIR="/backup/mysql"
DATE=$(date +%Y%m%d_%H%M%S)
LOG_FILE="${BACKUP_DIR}/backup_${DATE}.log"
RETENTION_DAYS=30

exec > >(tee -a "${LOG_FILE}") 2>&1
echo "Backup started: $(date)"

# Full logical backup
mysqldump --single-transaction --routines --triggers --events \
    --all-databases | gzip > "${BACKUP_DIR}/full_${DATE}.sql.gz"

# Generate checksum
sha256sum "${BACKUP_DIR}/full_${DATE}.sql.gz" > "${BACKUP_DIR}/full_${DATE}.sha256"

# Remove backups older than retention period
find "${BACKUP_DIR}" -name "full_*.sql.gz" -mtime +${RETENTION_DAYS} -delete
find "${BACKUP_DIR}" -name "full_*.sha256" -mtime +${RETENTION_DAYS} -delete
find "${BACKUP_DIR}" -name "backup_*.log" -mtime +${RETENTION_DAYS} -delete

echo "Backup completed: $(date)"
echo "Size: $(du -h "${BACKUP_DIR}/full_${DATE}.sql.gz" | cut -f1)"
```
Schedule with cron:
```shell
# Daily at 2:00 AM
0 2 * * * /usr/local/bin/mysql_backup.sh

# Weekly XtraBackup full + daily incrementals
0 3 * * 0 /usr/local/bin/xtrabackup_full.sh
0 3 * * 1-6 /usr/local/bin/xtrabackup_incremental.sh
```
Grandfather-Father-Son Rotation¶
The GFS rotation strategy balances retention depth with storage costs:
| Level | Frequency | Retention | Purpose |
|---|---|---|---|
| Son (daily) | Every day | 7-14 days | Recent recovery, accidental changes |
| Father (weekly) | Every Sunday | 4-5 weeks | Weekly recovery points |
| Grandfather (monthly) | First of month | 12 months | Monthly recovery, compliance |
| Yearly | January 1 | 3-7 years | Regulatory, legal holds |
A 500 GB database with GFS rotation:
- 14 daily backups: 14 x 500 GB = 7 TB (or ~700 GB compressed)
- 4 weekly: 4 x 500 GB = 2 TB (or ~200 GB compressed)
- 12 monthly: 12 x 500 GB = 6 TB (or ~600 GB compressed)
- Total: ~1.5 TB compressed storage for a full year of recovery points
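The arithmetic behind those figures, assuming roughly 10:1 compression (about 50 GB per compressed copy of the 500 GB database):

```shell
# 14 daily + 4 weekly + 12 monthly recovery points at ~50 GB each
PER_BACKUP_GB=50
DAILY=14; WEEKLY=4; MONTHLY=12
TOTAL_GB=$(( (DAILY + WEEKLY + MONTHLY) * PER_BACKUP_GB ))
echo "${TOTAL_GB} GB"   # 30 backups x 50 GB = 1500 GB, i.e. ~1.5 TB
```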
Cloud and Snapshot Backups¶
Filesystem Snapshots¶
Filesystem snapshots (LVM, ZFS, EBS) create near-instant point-in-time copies of the underlying storage. They are the fastest backup method because they record a copy-on-write view at the block or filesystem layer instead of reading data through the database.
See also
Filesystem-level backups often use LVM snapshots. For LVM concepts, snapshot creation, and filesystem management, see Disk and Filesystem.
LVM snapshots:
```shell
# Flush, lock, and snapshot within a single session - FLUSH TABLES WITH READ
# LOCK is released automatically when the client disconnects, so the lvcreate
# must run via the SYSTEM client command inside the same session
mysql -e "FLUSH TABLES WITH READ LOCK; SYSTEM lvcreate -L 10G -s -n mysql_snap /dev/vg0/mysql_data"

# Mount and copy the snapshot (the lock is already gone at this point)
mkdir /mnt/snap
mount /dev/vg0/mysql_snap /mnt/snap
rsync -a /mnt/snap/ /backup/mysql_snapshot/
umount /mnt/snap
lvremove -f /dev/vg0/mysql_snap
```
ZFS snapshots:
```shell
# ZFS snapshots are atomic and instant; take the snapshot inside the locking
# session - a snapshot taken by a separate command would run after the lock
# was released on disconnect
mysql -e "FLUSH TABLES WITH READ LOCK; SYSTEM zfs snapshot tank/mysql@backup_$(date +%Y%m%d)"

# Send snapshot to remote for offsite storage
zfs send tank/mysql@backup_20260219 | ssh backup-host zfs recv backup/mysql
```
AWS EBS snapshots:
```shell
# Initiate the EBS snapshot while the lock is held, using the same
# single-session pattern (the lock releases when the session ends)
mysql -e "FLUSH TABLES WITH READ LOCK; SYSTEM aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description 'MySQL backup'"
```
Consistency during snapshots
Without flushing tables and acquiring a lock (or using xtrabackup), a filesystem snapshot captures data files in an inconsistent state - some pages may be half-written. InnoDB's crash recovery can fix this on startup (similar to recovering from a power failure), but it is safer to flush first. For PostgreSQL, use pg_backup_start() and pg_backup_stop() (or pg_start_backup()/pg_stop_backup() on versions before 15) to ensure WAL consistency.
Consistency Considerations¶
| Database | Snapshot Method | Consistency Approach |
|---|---|---|
| MySQL/InnoDB | LVM/ZFS/EBS | FLUSH TABLES WITH READ LOCK then snapshot, or use XtraBackup |
| PostgreSQL | LVM/ZFS/EBS | pg_backup_start() / pg_backup_stop() + archive WAL |
| Any | Application-level | Quiesce writes at application layer before snapshot |
Retention Policies¶
How long you keep backups depends on three factors: recovery requirements, regulatory obligations, and storage budget.
Recovery Requirements¶
Define your Recovery Point Objective (RPO) and Recovery Time Objective (RTO):
- RPO: How much data loss is acceptable? An RPO of 1 hour means you need backups or log archiving at least every hour.
- RTO: How quickly must you restore service? An RTO of 30 minutes rules out logical backups for large databases.
Regulatory Requirements¶
Some industries mandate specific retention periods:
| Regulation | Retention Requirement |
|---|---|
| PCI DSS | 1 year minimum for audit logs |
| HIPAA | 6 years for health records |
| SOX | 7 years for financial records |
| GDPR | Only as long as necessary (minimum retention) |
GDPR is the outlier - it requires you to delete data you no longer need, which can conflict with long backup retention. Consult your compliance team.
Storage Cost Optimization¶
Strategies for managing backup storage costs:
- Tiered storage: keep recent backups on fast storage (SSD/NVMe), move older backups to cheaper storage (HDD, object storage like S3 Glacier)
- Incremental chains: store one full backup per week and incrementals daily (reduces storage by 60-80%)
- Compression: `gzip` reduces logical backups by 80-90%; `zstd` offers better compression ratios at similar speed
- Deduplication: tools like `borgbackup` or ZFS deduplication eliminate redundant blocks across backup versions
```shell
# Example: tiered storage with S3
# Recent backups stay on local disk: /backup/mysql/daily/ (last 7 days)

# Weekly backups pushed to S3 Standard
aws s3 cp /backup/mysql/weekly/ s3://mycompany-db-backups/weekly/ --recursive

# Monthly backups moved to S3 Glacier
aws s3 cp /backup/mysql/monthly/ s3://mycompany-db-backups/monthly/ \
    --storage-class GLACIER --recursive
```
Backup Strategy Design¶
Further Reading¶
- MySQL Backup and Recovery Documentation - official MySQL backup reference covering all built-in tools and strategies
- PostgreSQL Continuous Archiving and Point-in-Time Recovery - official PostgreSQL PITR documentation with WAL archiving setup
- Percona XtraBackup Documentation - full reference for hot InnoDB backups, incremental strategies, and restore procedures
- mydumper Project - parallel MySQL logical backup tool documentation and usage
- pgBackRest - production-grade PostgreSQL backup tool with parallel backup/restore, incremental backups, and S3 support
- Barman by EnterpriseDB - backup and recovery manager for PostgreSQL with remote backup, WAL archiving, and retention policies