Table of Contents
- What Are Snapshots?
- What Are Backups?
- Core Differences: Snapshots vs. Backups
- Use Cases: When to Use Snapshots vs. Backups
- Linux Snapshot Tools: A Deep Dive
- Linux Backup Tools: A Deep Dive
- Best Practices: Combining Snapshots and Backups
- Conclusion
- References
What Are Snapshots?
A snapshot is a point-in-time “copy” of a storage volume (e.g., a filesystem, logical volume, or disk partition) that captures the state of the data at the moment the snapshot is created. Unlike a full copy, snapshots are space-efficient because they use copy-on-write (CoW) or redirect-on-write (RoW) mechanisms to avoid duplicating unchanged data.
How Snapshots Work:
- Copy-on-Write (CoW): When a snapshot is created, the original volume and snapshot share the same data blocks. Only when data on the original volume is modified does the snapshot store the old version of the changed block. This ensures minimal initial storage usage.
- Redirect-on-Write (RoW): New writes to the original volume are redirected to a new location, leaving the original data (snapshot) untouched. This avoids copying old blocks entirely.
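The copy-on-write behaviour described above can be sketched with a toy model. This is only an illustration, not how LVM, Btrfs, or ZFS are implemented: each "block" is a plain file, and the names `vol`, `snap`, `cow_write`, and `snap_read` are made up for this example.

```shell
# Toy model of a copy-on-write snapshot (POSIX sh).
# Each "block" is a file; the snapshot directory stays empty until
# a block on the origin is overwritten for the first time.
vol=$(mktemp -d)     # origin volume: one file per block
snap=$(mktemp -d)    # snapshot store: holds only preserved old blocks
printf 'alpha' > "$vol/0"
printf 'beta'  > "$vol/1"
printf 'gamma' > "$vol/2"

# Write to the origin. On the first overwrite of a block, copy the
# old contents into the snapshot store before replacing them.
cow_write() {
    if [ ! -e "$snap/$1" ]; then
        cp "$vol/$1" "$snap/$1"
    fi
    printf '%s' "$2" > "$vol/$1"
}

# Read a block as the snapshot sees it: the preserved copy if one
# exists, otherwise the block still shared with the origin.
snap_read() {
    if [ -e "$snap/$1" ]; then
        cat "$snap/$1"
    else
        cat "$vol/$1"
    fi
}

cow_write 1 'beta-v2'
cow_write 1 'beta-v3'    # second write: the preserved copy is left alone

echo "origin block 1:   $(cat "$vol/1")"
echo "snapshot block 1: $(snap_read 1)"
echo "snapshot block 0: $(snap_read 0)"
echo "blocks copied:    $(ls "$snap" | wc -l | tr -d ' ')"
```

After the two writes, the snapshot still reads the original contents of block 1 while the origin holds the newest version, and only one block has been copied. Blocks 0 and 2 remain shared, which is why a fresh snapshot costs almost no space.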
Key Traits of Snapshots:
- Dependent on the Original Volume: Snapshots live on the same storage device as the original data. If the underlying disk fails, both the original data and snapshots are lost.
- Short-Term Use: Designed for temporary needs (e.g., testing changes, rolling back after a failed update).
- Fast Creation/Rollback: Snapshots are created in seconds (no full data copy) and allow near-instant rollbacks to the snapshot state.
What Are Backups?
A backup is a full or incremental copy of data stored on a separate, independent storage device (e.g., external HDD, network-attached storage (NAS), or cloud storage). Backups are designed for long-term data retention and disaster recovery.
Types of Backups:
- Full Backup: Copies all data from the source to the backup location (large storage footprint, slow to create but fast to restore).
- Incremental Backup: Copies only data changed since the last backup (smaller, faster, but requires the last full backup + all incrementals to restore).
- Differential Backup: Copies data changed since the last full backup (balances size and restore speed).
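The difference between the three types comes down to which reference point a file's modification time is compared against. A minimal sketch, assuming hypothetical marker files `last_full` and `last_backup` that a backup job would touch after each run (timestamps are set explicitly here so the example is repeatable):

```shell
# Which files would each backup type copy? (POSIX sh)
# All paths and marker names are hypothetical.
src=$(mktemp -d)      # data to back up
state=$(mktemp -d)    # marker files left behind by earlier backup runs

# Simulate a short history with explicit timestamps (YYYYMMDDhhmm).
touch -t 202401010000 "$src/old.txt"           # unchanged since before the full backup
touch -t 202401010100 "$state/last_full"       # the full backup ran here
touch -t 202401020000 "$src/changed_mon.txt"   # modified after the full backup
touch -t 202401020100 "$state/last_backup"     # an incremental backup ran here
touch -t 202401030000 "$src/changed_tue.txt"   # modified after that incremental

# Full: everything.
list_full()         { (cd "$src" && find . -type f | sort); }
# Differential: everything newer than the last FULL backup.
list_differential() { (cd "$src" && find . -type f -newer "$state/last_full" | sort); }
# Incremental: everything newer than the last backup of ANY type.
list_incremental()  { (cd "$src" && find . -type f -newer "$state/last_backup" | sort); }

echo "full:";         list_full
echo "differential:"; list_differential
echo "incremental:";  list_incremental
```

Here the full list contains all three files, the differential list contains both changed files, and the incremental list contains only the file changed since the previous run. Comparing modification times is a simplification: it does not notice deletions or renames, which dedicated backup tools track separately.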
Key Traits of Backups:
- Independent of Original Data: Stored externally, so they survive disk failures, ransomware, or accidental deletion of the original data.
- Long-Term Use: Retained for days, months, or years (e.g., compliance, archiving).
- Slower Creation/Restore: Full backups take time to create, and restoring large datasets may require hours (depending on size/medium).
Core Differences: Snapshots vs. Backups
To clarify, here’s a side-by-side comparison:
| Feature | Snapshots | Backups |
|---|---|---|
| Purpose | Short-term rollback; quick recovery from errors | Long-term retention; disaster recovery |
| Storage Location | Same device as original data | Separate device (external, NAS, cloud) |
| Storage Efficiency | High (CoW/RoW; minimal initial space) | Lower (full/incremental copies) |
| Recovery Speed | Near-instant (rollback to snapshot state) | Slower (depends on backup size/medium) |
| Dependence on Original Data | Dependent (fails if original data is lost) | Independent (survives original data loss) |
| Use Case Duration | Temporary (hours/days) | Permanent/Long-term (weeks/months/years) |
| Example Scenario | Roll back a failed software update | Restore data after a hard drive crash |
Use Cases: When to Use Snapshots vs. Backups
Use Snapshots When:
- Accidental File Deletion/Modification: You deleted a critical file 10 minutes ago—roll back to a snapshot taken this morning.
- Testing Changes: Before updating a server config or installing software, take a snapshot to revert if something breaks.
- Temporary State Capture: Freeze the state of a development environment for debugging.
Use Backups When:
- Disaster Recovery: A hard drive fails, or ransomware encrypts your data—restore from an offsite backup.
- Archiving: Store old project files or logs for compliance (e.g., 7-year retention for financial data).
- Data Migration: Move data to a new server by restoring a backup to the new hardware.
- Ransomware Protection: Since snapshots live on the same disk, ransomware can encrypt them too—backups on isolated storage are safe.
Linux Snapshot Tools: A Deep Dive
Linux offers robust snapshot tools, often tied to its advanced filesystems or volume managers. Let’s explore the most popular options.
LVM Snapshots
The Logical Volume Manager (LVM) is a Linux utility for managing disk volumes. LVM snapshots use CoW to capture the state of a logical volume (LV) at creation time.
How It Works:
- A snapshot LV is created with a fixed size (e.g., 10GB) to store changed blocks from the original LV.
- When data on the original LV is modified, the old block is copied to the snapshot LV before the new data is written (CoW).
Example Workflow:
- Create a snapshot of an LV named rootvol:
    lvcreate --size 10G --snapshot --name root_snap /dev/vg0/rootvol
- Mount the snapshot to inspect files:
    mount /dev/vg0/root_snap /mnt/snap
- Roll back to the snapshot (the merge consumes the snapshot; if the origin is in use, the merge is deferred until the LV is next activated, typically at reboot):
    lvconvert --merge /dev/vg0/root_snap
Limitations:
- Snapshot size is fixed; if it fills up (due to too many changes), it becomes invalid.
- Performance may degrade as the snapshot grows (more CoW operations).
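Because a full snapshot becomes invalid, it is worth watching how full each one is. The following is a rough sketch of such a check, not a finished monitoring script. It assumes input in the two-column form produced by `lvs --noheadings -o lv_name,data_percent`; the function name and the 80% default threshold are arbitrary choices for this example.

```shell
# Warn when an LVM snapshot is close to full (POSIX sh + awk).
# Reads "<lv_name> <data_percent>" lines on stdin, for example from:
#   lvs --noheadings -o lv_name,data_percent vg0
# Ordinary LVs have an empty data_percent column and are skipped.
# $1 is the warning threshold in percent (default 80, an example value).
check_snapshots() {
    awk -v limit="${1:-80}" '
        BEGIN { bad = 0 }
        NF == 2 && ($2 + 0) >= (limit + 0) {
            printf "WARNING: %s is %s%% full\n", $1, $2
            bad = 1
        }
        END { exit bad }   # non-zero exit status if any snapshot is over the limit
    '
}

# Typical use (requires root and an existing volume group named vg0):
#   lvs --noheadings -o lv_name,data_percent vg0 | check_snapshots 80
```

When a warning fires, the snapshot can be grown with `lvextend` (for example `lvextend -L +5G /dev/vg0/root_snap`) as long as the volume group has free extents.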
Btrfs Snapshots
Btrfs (B-tree Filesystem) is a modern, copy-on-write filesystem with built-in snapshot support for subvolumes (independent filesystems within Btrfs).
Key Features:
- Subvolume Snapshots: Snapshots are created at the subvolume level (not entire disks).
- Read-only snapshots: Prevent accidental modification.
- Read-write snapshots: Allow changes (useful for testing).
- Incremental Transfers: Use btrfs send/receive to push incremental snapshots to a backup server.
Example Workflow:
- Create a read-only snapshot of subvolume @home:
    btrfs subvolume snapshot -r /mnt/btrfs/@home /mnt/btrfs/@home_snap_20240101
- Send the snapshot to a remote NAS (this first transfer is a full send; later transfers become incremental by naming the previous snapshot as parent with -p):
    btrfs send /mnt/btrfs/@home_snap_20240101 | ssh user@nas "btrfs receive /backup/btrfs_snaps"
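An incremental transfer needs a parent snapshot, passed to `btrfs send` with `-p`, and that parent must already exist on the receiving side. The sketch below only prints the command it would run, so it can be tried without a Btrfs filesystem. It assumes (this is not from the workflow above) that all snapshots of one subvolume sit in a single directory and that their names sort chronologically; `plan_send` and its arguments are hypothetical.

```shell
# Dry-run planner: prints the btrfs send pipeline instead of running it.
# Assumptions: snapshots of one subvolume live in one directory and
# their names sort chronologically (e.g. a YYYYMMDD suffix). The remote
# host and target path are placeholders supplied by the caller.
plan_send() {
    snap_dir=$1; remote=$2; target=$3
    newest=$(ls -1 "$snap_dir" | sort | tail -n 1)
    previous=$(ls -1 "$snap_dir" | sort | tail -n 2 | head -n 1)
    if [ -z "$newest" ]; then
        echo "no snapshots found in $snap_dir" >&2
        return 1
    fi
    if [ "$previous" = "$newest" ]; then
        # Only one snapshot exists: nothing to diff against, send it whole.
        printf 'btrfs send %s | ssh %s btrfs receive %s\n' \
            "$snap_dir/$newest" "$remote" "$target"
    else
        # -p names the parent; only the differences from it are sent.
        printf 'btrfs send -p %s %s | ssh %s btrfs receive %s\n' \
            "$snap_dir/$previous" "$snap_dir/$newest" "$remote" "$target"
    fi
}

# Example: plan_send /mnt/btrfs/snaps user@nas /backup/btrfs_snaps
```

Both the parent and the snapshot being sent must be read-only, which is why the snapshot in the workflow is created with `-r`.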
Advantages:
- No fixed size limits (grows dynamically with changes).
- Native integration with Btrfs subvolumes (no separate tooling needed).
ZFS Snapshots
ZFS (Zettabyte File System) is a powerful filesystem/volume manager with enterprise-grade snapshot capabilities. Like Btrfs, ZFS uses CoW and integrates snapshots with datasets (ZFS’s equivalent of subvolumes).
Key Features:
- Incremental Snapshots: Only store changes between snapshots (e.g., tank/home@20240101 and tank/home@20240102 share unchanged data).
- Dataset-Level Snapshots: Snapshots are scoped to datasets, making them easy to manage.
- Clones: Create writable copies of snapshots (useful for testing).
Example Workflow:
- Create a snapshot of dataset tank/home:
    zfs snapshot tank/home@20240101
- List snapshots:
    zfs list -t snapshot
- Restore a file from a snapshot:
    cp /tank/home/.zfs/snapshot/20240101/lost_file.txt /tank/home/
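Daily snapshots accumulate, so the workflow usually ends with a pruning step. A minimal sketch of "keep the newest N" follows. The function name `snapshots_to_prune` is made up for this example, and it only selects names; the destroy step is left as a comment so nothing is deleted by accident.

```shell
# Select old snapshots for deletion, keeping the newest N (POSIX sh + awk).
# Reads snapshot names on stdin, oldest first, for example from:
#   zfs list -H -d 1 -t snapshot -o name -s creation tank/home
# and prints every name except the last $1.
snapshots_to_prune() {
    awk -v keep="$1" '
        { names[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print names[i] }
    '
}

# Typical use (destructive, review the list first):
#   zfs list -H -d 1 -t snapshot -o name -s creation tank/home |
#       snapshots_to_prune 7 |
#       while IFS= read -r s; do zfs destroy "$s"; done
```

Sorting by creation time on the `zfs list` side avoids relying on the snapshot names themselves being sortable.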
Advantages:
- Snapshots are atomic (no partial snapshots if the system crashes).
- Integration with ZFS’s data integrity features (checksums, RAID-Z).
Linux Backup Tools: A Deep Dive
Linux offers a rich ecosystem of backup tools, from simple command-line utilities to enterprise-grade solutions.
rsync: The Workhorse
rsync is a command-line tool for file-level synchronization and backup. It’s lightweight, widely available, and supports incremental backups via --link-dest (hard links to unchanged files).
Key Features:
- Incremental Backups: Copies only changed files.
- Delta Transfer: Sends only the changed parts of files (reduces bandwidth).
- Flexibility: Works with local files, SSH, or rsync daemon (network backups).
Example: Daily Incremental Backup to NAS
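#!/bin/bash
set -euo pipefail

BACKUP_SRC="/home/user"
REMOTE="user@nas"
REMOTE_DIR="/backup/rsync_daily"
TODAY="$(date +%Y%m%d)"

# Create today's backup. --link-dest is resolved on the receiving side,
# relative to the destination directory, so "../latest" means
# $REMOTE_DIR/latest. Unchanged files become hard links into that backup.
# On the very first run "latest" does not exist yet; rsync warns and
# makes a full copy.
rsync -av --link-dest="../latest" "$BACKUP_SRC" "$REMOTE:$REMOTE_DIR/$TODAY/"

# Repoint the "latest" symlink on the NAS (a local rm/ln cannot touch a remote path)
ssh "$REMOTE" "ln -sfn '$TODAY' '$REMOTE_DIR/latest'"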
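To see why hard-linked backups take so little extra space, the effect can be reproduced locally without rsync. The sketch below uses `cp -al` (GNU coreutils) as a stand-in for what `--link-dest` achieves: unchanged files in two dated directories end up as the same inode, and only the changed file is stored twice. The directory and file names are invented for the demonstration.

```shell
# Hard-linked daily backups in miniature (sh, GNU cp for the -l option).
work=$(mktemp -d)
mkdir "$work/src"
echo "stable"  > "$work/src/keep.txt"
echo "draft 1" > "$work/src/edit.txt"

# Day 1: an ordinary full copy.
cp -a "$work/src" "$work/day1"

# Day 2: one file changes in the source.
echo "draft 2" > "$work/src/edit.txt"

# Start day2 as a tree of hard links to day1, then replace only the
# changed file. The link is removed first so day1's copy is not modified.
cp -al "$work/day1" "$work/day2"
rm "$work/day2/edit.txt"
cp -a "$work/src/edit.txt" "$work/day2/edit.txt"

# Print a file's inode number; equal inodes mean one copy on disk.
inode() { ls -i "$1" | awk '{print $1}'; }

echo "keep.txt inodes: $(inode "$work/day1/keep.txt") $(inode "$work/day2/keep.txt")"
echo "edit.txt inodes: $(inode "$work/day1/edit.txt") $(inode "$work/day2/edit.txt")"
```

The two `keep.txt` entries report the same inode number, while `edit.txt` has a different inode in each directory. Every dated directory still looks like a complete backup and can be deleted independently of the others.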
borgbackup: Deduplication & Encryption
borgbackup (or Borg) is a deduplicating backup tool designed for security and efficiency. It’s ideal for users needing encrypted, space-efficient backups.
Key Features:
- Deduplication: Eliminates redundant data (e.g., multiple backups of the same file).
- Encryption: AES-256 encryption for backups (protects data in transit/at rest).
- Compression: Reduces backup size (zlib, LZ4, or zstd).
Example: Create an Encrypted Backup
# Initialize a borg repository (encrypted)
borg init --encryption=repokey /backup/borg_repo
# Create a backup of /home/user (deduplicated, compressed)
borg create --compression zstd /backup/borg_repo::"backup-{now:%Y%m%d}" /home/user
Timeshift: System Restore Made Easy
Timeshift is a GUI/CLI tool for system backups, inspired by Windows System Restore. It’s popular for desktop users needing simple rollbacks after updates or config changes.
How It Works:
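- Uses rsync (for non-Btrfs systems) or Btrfs snapshots (for Btrfs systems) to create periodic system backups.
- In rsync mode, backups can be stored on a separate partition or external drive; in Btrfs mode, snapshots stay on the same Btrfs volume as the system, so they do not protect against disk failure.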
- Restores the entire system to a previous state (e.g., before a failed update).
Example CLI Usage:
# Create a backup (rsync mode)
timeshift --create --comments "Before updating kernel" --tags D
# List backups
timeshift --list
# Restore from backup (interactive)
timeshift --restore
Enterprise-Grade Tools: Amanda/BackupPC
For large organizations, tools like Amanda (Advanced Maryland Automatic Network Disk Archiver) or BackupPC offer centralized, network-wide backup management:
- Amanda: Supports tape, disk, and cloud backups; scales to thousands of clients.
- BackupPC: Open-source, deduplicates across clients, and runs on Linux servers.
Best Practices: Combining Snapshots and Backups
Snapshots and backups are complementary, not competing. Here’s how to use them together:
- Snapshots for Short-Term Safety:
  - Take snapshots before risky operations (e.g., apt upgrade, editing critical configs).
  - Retain snapshots for 1–7 days (e.g., LVM/Btrfs snapshots on the local disk).
- Backups for Long-Term Security:
  - Run daily incremental backups to an external NAS or cloud (e.g., borgbackup + rsync).
  - Store monthly full backups offsite (e.g., AWS S3, encrypted external HDD).
- Test Restores Regularly:
  - Verify snapshots by rolling back to a test environment.
  - Restore a random file from backups to ensure data integrity.
- Automate Everything:
  - Use cron or systemd timers to auto-create snapshots (e.g., daily LVM snapshots).
  - Schedule backups during off-peak hours (e.g., 2 AM) to avoid performance hits.
- Encrypt Backups:
  - Use borgbackup’s encryption or LUKS-encrypt external backup drives to protect against theft.
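A restore test can be as simple as restoring into a scratch directory and comparing it byte for byte with the source. The following is one possible sketch; `verify_restore` is a hypothetical helper, not part of any of the tools above, and it expects two directory paths from the caller.

```shell
# Compare a restored tree with its source (POSIX sh).
# Prints "MISMATCH: <relative path>" for every file that is missing
# from the restore or differs in content; returns 1 if any were found.
verify_restore() {
    src=${1%/}; restored=${2%/}
    failures=$(find "$src" -type f | while IFS= read -r f; do
        rel=${f#"$src"/}                      # path relative to the source root
        cmp -s "$f" "$restored/$rel" 2>/dev/null || printf '%s\n' "$rel"
    done)
    if [ -z "$failures" ]; then
        return 0
    fi
    printf '%s\n' "$failures" | sed 's/^/MISMATCH: /'
    return 1
}

# Example: verify_restore /home/user /tmp/restore_test/home/user
```

On a live system, files edited since the backup was taken will also be reported, so the check is most meaningful against a source that has not changed in the meantime (for instance, a read-only snapshot of the same moment).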
Conclusion
Snapshots and backups serve distinct roles in Linux data protection:
- Snapshots are your “undo button” for short-term mistakes—fast, space-efficient, but dependent on the original disk.
- Backups are your “insurance policy” for disasters—independent, secure, but slower to create/restore.
By combining snapshots (local, temporary) and backups (external, long-term), you create a resilient data protection strategy. Linux offers powerful tools for both—from LVM/Btrfs snapshots to rsync/borgbackup—so choose based on your needs (desktop vs. enterprise, CLI vs. GUI).
Remember: No single tool solves all problems. A layered approach (snapshots + backups + offsite storage) is the best way to safeguard your data.