thelinuxvault guide

Snapshots vs Backups: Understanding Linux Options

In the world of Linux system administration, data protection is paramount. Whether you’re a home user managing a personal server or an enterprise admin overseeing critical infrastructure, safeguarding data from accidental deletion, hardware failure, or malicious attacks is non-negotiable.

Two terms that often surface in this context are **snapshots** and **backups**, but they are not interchangeable. They serve distinct purposes, operate differently, and solve different problems. Confusing them can lead to gaps in your data protection strategy (e.g., relying on snapshots alone for disaster recovery, only to lose everything when a disk fails).

This post explains snapshots and backups in the Linux ecosystem: their definitions, core differences, use cases, and the tools available to implement them. By the end, you’ll know when to use each and how to combine Linux’s tools to keep your data safe.

Table of Contents

  1. What Are Snapshots?
  2. What Are Backups?
  3. Core Differences: Snapshots vs. Backups
  4. Use Cases: When to Use Snapshots vs. Backups
  5. Linux Snapshot Tools: A Deep Dive
  6. Linux Backup Tools: A Deep Dive
  7. Best Practices: Combining Snapshots and Backups
  8. Conclusion
  9. References

What Are Snapshots?

A snapshot is a point-in-time “copy” of a storage volume (e.g., a filesystem, logical volume, or disk partition) that captures the state of the data at the moment the snapshot is created. Unlike a full copy, snapshots are space-efficient because they use copy-on-write (CoW) or redirect-on-write (RoW) mechanisms to avoid duplicating unchanged data.

How Snapshots Work:

  • Copy-on-Write (CoW): When a snapshot is created, the original volume and snapshot share the same data blocks. Only when data on the original volume is modified does the snapshot store the old version of the changed block. This ensures minimal initial storage usage.
  • Redirect-on-Write (RoW): New writes to the original volume are redirected to a new location, leaving the original data (snapshot) untouched. This avoids copying old blocks entirely.
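
To make the copy-on-write idea concrete, here is a toy model in shell. It is only a sketch: each "block" is a small file in a scratch directory, and the function names cow_write and snap_read are invented for illustration. Real snapshot implementations work on disk blocks inside the kernel.

```bash
# Toy copy-on-write model: each "block" is a small file in a scratch directory.
work=$(mktemp -d)
mkdir "$work/origin" "$work/snap"
printf 'A0\n' > "$work/origin/0"
printf 'B0\n' > "$work/origin/1"
# "Taking" the snapshot costs nothing here: $work/snap starts empty and
# every block is still shared with the origin.

# Write to the live volume, preserving the old block the first time it changes.
cow_write() {   # usage: cow_write BLOCK NEW_DATA
  if [ ! -e "$work/snap/$1" ]; then
    cp "$work/origin/$1" "$work/snap/$1"
  fi
  printf '%s\n' "$2" > "$work/origin/$1"
}

# Read a block as the snapshot sees it: the preserved copy if one exists,
# otherwise the block it still shares with the origin.
snap_read() {   # usage: snap_read BLOCK
  if [ -e "$work/snap/$1" ]; then
    cat "$work/snap/$1"
  else
    cat "$work/origin/$1"
  fi
}

cow_write 1 B1
cow_write 1 B2
echo "live block 1:     $(cat "$work/origin/1")"   # B2
echo "snapshot block 1: $(snap_read 1)"            # B0
echo "snapshot block 0: $(snap_read 0)"            # A0 (never copied)
```

Block 1 was written twice but copied only once, and block 0 was never copied at all, which is why a snapshot starts out nearly free and grows only as the original changes.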

Key Traits of Snapshots:

  • Dependent on the Original Volume: Snapshots live on the same storage device as the original data. If the underlying disk fails, both the original data and snapshots are lost.
  • Short-Term Use: Designed for temporary needs (e.g., testing changes, rolling back after a failed update).
  • Fast Creation/Rollback: Snapshots are created in seconds (no full data copy) and allow near-instant rollbacks to the snapshot state.

What Are Backups?

A backup is a full or incremental copy of data stored on a separate, independent storage device (e.g., external HDD, network-attached storage (NAS), or cloud storage). Backups are designed for long-term data retention and disaster recovery.

Types of Backups:

  • Full Backup: Copies all data from the source to the backup location (large storage footprint, slow to create but fast to restore).
  • Incremental Backup: Copies only data changed since the last backup (smaller, faster, but requires the last full backup + all incrementals to restore).
  • Differential Backup: Copies data changed since the last full backup (balances size and restore speed: a restore needs only the last full backup plus the latest differential).
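
The restore requirements of the three types can be sketched as a small shell function. This is an illustration under simple assumptions, not how any particular backup tool tracks its chains: backups are passed oldest first as TYPE:LABEL, and both the function name restore_chain and the labels are made up.

```bash
# Print the backups needed to restore to the most recent one.
# Arguments are given oldest first, each as TYPE:LABEL (full, diff, or incr).
# The body runs in a subshell so its variables stay local.
restore_chain() (
  chain=""
  full=""
  for b in "$@"; do
    case ${b%%:*} in
      full) full=$b; chain=$b ;;      # a full backup starts a new chain
      diff) chain="$full $b" ;;       # a differential replaces everything since the full
      incr) chain="$chain $b" ;;      # an incremental depends on every backup before it
    esac
  done
  echo "$chain"
)

restore_chain full:sun incr:mon incr:tue incr:wed   # full:sun incr:mon incr:tue incr:wed
restore_chain full:sun diff:mon diff:tue diff:wed   # full:sun diff:wed
```

The incremental chain grows by one element per day, while the differential chain always has two, at the cost of each differential being larger than the one before.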

Key Traits of Backups:

  • Independent of Original Data: Stored externally, so they survive disk failures, ransomware, or accidental deletion of the original data.
  • Long-Term Use: Retained for days, months, or years (e.g., compliance, archiving).
  • Slower Creation/Restore: Full backups take time to create, and restoring large datasets may require hours (depending on size/medium).

Core Differences: Snapshots vs. Backups

To clarify, here’s a side-by-side comparison:

| Feature | Snapshots | Backups |
| --- | --- | --- |
| Purpose | Short-term rollback; quick recovery from errors | Long-term retention; disaster recovery |
| Storage Location | Same device as original data | Separate device (external, NAS, cloud) |
| Storage Efficiency | High (CoW/RoW; minimal initial space) | Lower (full/incremental copies) |
| Recovery Speed | Near-instant (rollback to snapshot state) | Slower (depends on backup size/medium) |
| Dependence on Original Data | Dependent (fails if original data is lost) | Independent (survives original data loss) |
| Use Case Duration | Temporary (hours/days) | Permanent/Long-term (weeks/months/years) |
| Example Scenario | Roll back a failed software update | Restore data after a hard drive crash |

Use Cases: When to Use Snapshots vs. Backups

Use Snapshots When:

  • Accidental File Deletion/Modification: You deleted a critical file 10 minutes ago. Copy it back from a snapshot taken this morning, or roll the whole volume back.
  • Testing Changes: Before updating a server config or installing software, take a snapshot to revert if something breaks.
  • Temporary State Capture: Freeze the state of a development environment for debugging.

Use Backups When:

  • Disaster Recovery: A hard drive fails, or ransomware encrypts your data—restore from an offsite backup.
  • Archiving: Store old project files or logs for compliance (e.g., 7-year retention for financial data).
  • Data Migration: Move data to a new server by restoring a backup to the new hardware.
  • Ransomware Protection: Snapshots live on the same system as the live data, so an attacker with root access can delete or corrupt them along with it. Backups on isolated storage stay out of reach.

Linux Snapshot Tools: A Deep Dive

Linux offers robust snapshot tools, often tied to its advanced filesystems or volume managers. Let’s explore the most popular options.

LVM Snapshots

The Logical Volume Manager (LVM) is a Linux utility for managing disk volumes. LVM snapshots use CoW to capture the state of a logical volume (LV) at creation time.

How It Works:

  • A snapshot LV is created with a fixed size (e.g., 10GB) to store changed blocks from the original LV.
  • When data on the original LV is modified, the old block is copied to the snapshot LV before the new data is written (CoW).

Example Workflow:

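  1. Create a snapshot of an LV named rootvol:
    lvcreate --size 10G --snapshot --name root_snap /dev/vg0/rootvol
  2. Mount the snapshot read-only to inspect files (XFS also needs -o nouuid):
    mount -o ro /dev/vg0/root_snap /mnt/snap
  3. Unmount the snapshot, then roll back by merging it into the origin. The snapshot is removed once the merge completes. If the origin is in use, as a root volume always is, the merge starts at its next activation (typically the next reboot):
    umount /mnt/snap
    lvconvert --merge /dev/vg0/root_snap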

Limitations:

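  • Snapshot size is fixed at creation; if it fills up (due to too many changes), it becomes invalid. It can be grown with lvextend, or automatically via the snapshot_autoextend_threshold and snapshot_autoextend_percent settings in lvm.conf.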
  • Performance may degrade as the snapshot grows (more CoW operations).
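
Because a full snapshot is a lost snapshot, it helps to watch how full each one is. The sketch below is one possible approach, assuming the input is the two-column output of lvs with the lv_name and data_percent fields; the function name check_snapshot_fill and the sample numbers are invented for illustration.

```bash
# Read "name percent" pairs and report snapshots at or above a fill threshold.
# Lines without a percentage (regular LVs) are skipped.
# Exit status: 0 if every snapshot is below the threshold, 1 otherwise.
check_snapshot_fill() {   # usage: check_snapshot_fill THRESHOLD_PERCENT < lvs_output
  awk -v limit="$1" '
    NF >= 2 && $2 + 0 >= limit + 0 {
      printf "WARNING: %s is %s%% full\n", $1, $2
      over = 1
    }
    END { exit over ? 1 : 0 }
  '
}

# Sample input standing in for real lvs output (names and numbers are made up).
printf '  root_snap 82.15\n  home_snap 3.40\n' | check_snapshot_fill 80 ||
  echo "at least one snapshot needs attention"
```

On a real system the input would come from something like `lvs --noheadings -o lv_name,data_percent vg0 | check_snapshot_fill 80`, run from cron or a systemd timer.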

Btrfs Snapshots

Btrfs (B-tree Filesystem) is a modern, copy-on-write filesystem with built-in snapshot support for subvolumes (independently mountable file trees within one Btrfs filesystem).

Key Features:

  • Subvolume Snapshots: Snapshots are created at the subvolume level (not entire disks).
    • Read-only snapshots: Prevent accidental modification.
    • Read-write snapshots: Allow changes (useful for testing).
  • Incremental Transfers: Use btrfs send/receive to push incremental snapshots to a backup server.

Example Workflow:

  1. Create a read-only snapshot of subvolume @home:
    btrfs subvolume snapshot -r /mnt/btrfs/@home /mnt/btrfs/@home_snap_20240101  
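  2. Send the snapshot to a remote NAS (the first transfer is always a full copy):
    btrfs send /mnt/btrfs/@home_snap_20240101 | ssh user@nas "btrfs receive /backup/btrfs_snaps"
  3. Send later snapshots incrementally by naming a parent with -p (here assuming a read-only snapshot @home_snap_20240102 was taken the next day and the parent already exists on the NAS):
    btrfs send -p /mnt/btrfs/@home_snap_20240101 /mnt/btrfs/@home_snap_20240102 | ssh user@nas "btrfs receive /backup/btrfs_snaps"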
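
A btrfs send transfer is only incremental when a parent snapshot is named with -p; without it, the whole snapshot is sent. The helper below is a rough sketch of how a script might choose that parent. It only prints the command rather than running it, the name build_send_cmd is made up, and it assumes snapshot names end in a sortable date stamp and that the chosen parent also exists on the receiving side.

```bash
# Print the btrfs send command for NEW_SNAPSHOT. Earlier snapshot paths are
# read from stdin, one per line; the newest one that sorts before NEW_SNAPSHOT
# becomes the parent (-p). With no earlier snapshot, a full send is printed.
build_send_cmd() (   # usage: build_send_cmd NEW_SNAPSHOT < existing_snapshots
  parent=$(awk -v new="$1" 'NF && $0 < new' | sort | tail -n 1)
  if [ -n "$parent" ]; then
    echo "btrfs send -p $parent $1"
  else
    echo "btrfs send $1"
  fi
)

# Two earlier snapshots exist: the newer one is picked as parent.
printf '%s\n' /mnt/btrfs/@home_snap_20231230 /mnt/btrfs/@home_snap_20231231 |
  build_send_cmd /mnt/btrfs/@home_snap_20240101

# No earlier snapshot: falls back to a full send.
build_send_cmd /mnt/btrfs/@home_snap_20240101 < /dev/null
```

Printing the command first makes it easy to review before piping the real thing into ssh and btrfs receive.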

Advantages:

  • No fixed size limits (grows dynamically with changes).
  • Native integration with Btrfs subvolumes (no separate tooling needed).

ZFS Snapshots

ZFS (Zettabyte File System) is a powerful filesystem/volume manager with enterprise-grade snapshot capabilities. Like Btrfs, ZFS uses CoW and integrates snapshots with datasets (ZFS’s equivalent of subvolumes).

Key Features:

  • Incremental Snapshots: Only store changes between snapshots (e.g., tank/home@20240101 and tank/home@20240102 share unchanged data).
  • Dataset-Level Snapshots: Snapshots are scoped to datasets, making them easy to manage.
  • Clones: Create writable copies of snapshots (useful for testing).

Example Workflow:

  1. Create a snapshot of dataset tank/home:
    zfs snapshot tank/home@20240101  
  2. List snapshots:
    zfs list -t snapshot  
  3. Restore a file from a snapshot:
    cp /tank/home/.zfs/snapshot/20240101/lost_file.txt /tank/home/  
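
When you do not know which snapshot still holds a deleted file, you can search the .zfs/snapshot directory. The function below is a minimal sketch, assuming snapshot names are sortable date stamps as in the workflow above; the name find_in_snapshots is invented, and the demo runs against a scratch directory that merely mimics the layout.

```bash
# Print the path of a file inside the newest snapshot that still contains it.
# Shell globs expand in sorted order, so the last match is the newest snapshot.
# Returns non-zero if no snapshot has the file.
find_in_snapshots() (   # usage: find_in_snapshots MOUNTPOINT RELATIVE_FILE
  found=""
  for dir in "$1"/.zfs/snapshot/*/; do
    if [ -e "$dir$2" ]; then
      found="$dir$2"
    fi
  done
  [ -n "$found" ] && echo "$found"
)

# Demo on a scratch directory laid out like a dataset's snapshot tree.
demo=$(mktemp -d)
mkdir -p "$demo/.zfs/snapshot/20240101" "$demo/.zfs/snapshot/20240102"
touch "$demo/.zfs/snapshot/20240101/lost_file.txt"   # file was gone by 20240102
find_in_snapshots "$demo" lost_file.txt              # prints the 20240101 copy
```

On a real dataset the call would look like `find_in_snapshots /tank/home lost_file.txt`, and the printed path can be passed straight to cp.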

Advantages:

  • Snapshots are atomic (no partial snapshots if the system crashes).
  • Integration with ZFS’s data integrity features (checksums, RAID-Z).

Linux Backup Tools: A Deep Dive

Linux offers a rich ecosystem of backup tools, from simple command-line utilities to enterprise-grade solutions.

rsync: The Workhorse

rsync is a command-line tool for file-level synchronization and backup. It’s lightweight, widely available, and supports incremental backups via --link-dest (hard links to unchanged files).

Key Features:

  • Incremental Backups: Copies only changed files.
  • Delta Transfer: Sends only the changed parts of files (reduces bandwidth).
  • Flexibility: Works with local files, SSH, or rsync daemon (network backups).

Example: Daily Incremental Backup to NAS

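#!/bin/bash
set -euo pipefail

BACKUP_SRC="/home/user/"           # trailing slash: copy the contents, not the directory itself
REMOTE="user@nas"
REMOTE_DIR="/backup/rsync_daily"   # path on the NAS
TODAY="$(date +%Y%m%d)"

# --link-dest is resolved on the receiving side, so it takes a plain path on the
# NAS (no user@host: prefix). Unchanged files become hard links into the previous
# backup. On the first run "latest" does not exist yet; rsync warns and copies everything.
rsync -av --link-dest="$REMOTE_DIR/latest" "$BACKUP_SRC" "$REMOTE:$REMOTE_DIR/$TODAY/"

# Repoint the "latest" symlink on the NAS to today's backup
ssh "$REMOTE" "ln -sfn '$REMOTE_DIR/$TODAY' '$REMOTE_DIR/latest'"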
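
To confirm that --link-dest really shared unchanged files instead of copying them, you can check whether the same path in two backup directories is one file on disk. The sketch below uses the shell's -ef test (same device and inode); the function name is_hardlinked is made up, and the demo fakes two daily backups in a scratch directory with ln instead of running rsync.

```bash
# Succeeds if the same relative path in two backup directories is a single
# file on disk, i.e. the newer backup hard-links to the older one.
is_hardlinked() {   # usage: is_hardlinked OLD_BACKUP_DIR NEW_BACKUP_DIR RELATIVE_PATH
  [ "$1/$3" -ef "$2/$3" ]
}

# Demo: fake two daily backups in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/20240101" "$demo/20240102"
echo "unchanged" > "$demo/20240101/notes.txt"
ln "$demo/20240101/notes.txt" "$demo/20240102/notes.txt"   # what --link-dest does for unchanged files
echo "old" > "$demo/20240101/todo.txt"
echo "new" > "$demo/20240102/todo.txt"                     # changed files get their own copy

is_hardlinked "$demo/20240101" "$demo/20240102" notes.txt && echo "notes.txt: shared"
is_hardlinked "$demo/20240101" "$demo/20240102" todo.txt  || echo "todo.txt: separate copy"
```

The check has to run on the machine that stores the backups (the NAS in the example above), since hard links are a property of that filesystem.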

borgbackup: Deduplication & Encryption

borgbackup (or Borg) is a deduplicating backup tool designed for security and efficiency. It’s ideal for users needing encrypted, space-efficient backups.

Key Features:

  • Deduplication: Eliminates redundant data (e.g., multiple backups of the same file).
  • Encryption: AES-256 encryption for backups (protects data in transit/at rest).
  • Compression: Reduces backup size (zlib, LZ4, or zstd).

Example: Create an Encrypted Backup

# Initialize a borg repository (encrypted)  
borg init --encryption=repokey /backup/borg_repo  

# Create a backup of /home/user (deduplicated, compressed)  
borg create --compression zstd /backup/borg_repo::"backup-{now:%Y%m%d}" /home/user  
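
The deduplication idea can be illustrated with a few lines of shell. This is a deliberately simplified sketch and not Borg's implementation: it cuts files into fixed-size chunks, whereas Borg uses content-defined, variable-size chunks, and the function name dedup_store is invented. It relies on split and sha256sum from coreutils.

```bash
# Store FILE in STORE_DIR as fixed-size chunks named by their SHA-256 hash.
# A chunk whose hash is already present is not written again.
# Prints the number of new chunks written.
dedup_store() (   # usage: dedup_store STORE_DIR FILE [CHUNK_BYTES]
  tmp=$(mktemp -d)
  split -b "${3:-4096}" "$2" "$tmp/chunk."
  new=0
  for chunk in "$tmp"/chunk.*; do
    [ -e "$chunk" ] || continue          # empty input produces no chunks
    hash=$(sha256sum "$chunk" | cut -d' ' -f1)
    if [ ! -e "$1/$hash" ]; then
      mv "$chunk" "$1/$hash"
      new=$((new + 1))
    fi
  done
  rm -rf "$tmp"
  echo "$new"
)

store=$(mktemp -d)
src=$(mktemp)
head -c 8192 /dev/zero > "$src"   # two identical 4 KiB chunks
dedup_store "$store" "$src"       # 1: both halves hash to the same chunk
dedup_store "$store" "$src"       # 0: a second "backup" adds nothing
```

The second run writes no data at all, which is the effect that makes repeated backups of mostly unchanged data cheap.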

Timeshift: System Restore Made Easy

Timeshift is a GUI/CLI tool for system backups, inspired by Windows System Restore. It’s popular for desktop users needing simple rollbacks after updates or config changes.

How It Works:

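  • Uses rsync (for non-Btrfs systems) or Btrfs snapshots (for Btrfs systems) to create periodic system snapshots. By default it covers system files and settings, not user data in home directories.
  • In rsync mode, snapshots can be stored on a separate partition or external drive. In Btrfs mode, they stay on the same Btrfs filesystem as the system, so they do not protect against disk failure.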
  • Restores the entire system to a previous state (e.g., before a failed update).

Example CLI Usage:

# Create a backup (rsync mode)  
timeshift --create --comments "Before updating kernel" --tags D  

# List backups  
timeshift --list  

# Restore from backup (interactive)  
timeshift --restore  

Enterprise-Grade Tools: Amanda/BackupPC

For large organizations, tools like Amanda (Advanced Maryland Automatic Network Disk Archiver) or BackupPC offer centralized, network-wide backup management:

  • Amanda: Supports tape, disk, and cloud backups; scales to thousands of clients.
  • BackupPC: Open-source, deduplicates across clients, and runs on Linux servers.

Best Practices: Combining Snapshots and Backups

Snapshots and backups are complementary, not competing. Here’s how to use them together:

  1. Snapshots for Short-Term Safety:

    • Take snapshots before risky operations (e.g., apt upgrade, editing critical configs).
    • Retain snapshots for 1–7 days (e.g., LVM/Btrfs snapshots on the local disk).
  2. Backups for Long-Term Security:

    • Run daily incremental backups to an external NAS or cloud (e.g., with borgbackup or rsync).
    • Store monthly full backups offsite (e.g., AWS S3, encrypted external HDD).
  3. Test Restores Regularly:

    • Verify snapshots by rolling back to a test environment.
    • Restore a random file from backups to ensure data integrity.
  4. Automate Everything:

    • Use cron or systemd timers to auto-create snapshots (e.g., daily LVM snapshots).
    • Schedule backups during off-peak hours (e.g., 2 AM) to avoid performance hits.
  5. Encrypt Backups:

    • Use borgbackup’s encryption or LUKS-encrypt external backup drives to protect against theft.
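
Automated snapshots also need automated cleanup, or the 1–7 day retention window above turns into months of stale snapshots. The function below is a small sketch of a keep-the-newest-N policy. It only prints the names to delete; the name snapshots_to_prune is made up, and it assumes one snapshot name per line, all from the same volume, each ending in a sortable date stamp.

```bash
# Read snapshot names (one per line) and print the ones to delete so that
# only the newest KEEP remain. Nothing is deleted here; the output is a list.
snapshots_to_prune() {   # usage: snapshots_to_prune KEEP < snapshot_names
  sort -r | tail -n "+$(( $1 + 1 ))"
}

# Four daily snapshots, keep the newest two.
printf '%s\n' home@20240101 home@20240102 home@20240103 home@20240104 |
  snapshots_to_prune 2
# prints:
# home@20240102
# home@20240101
```

The input could come from `zfs list -H -o name -t snapshot` (filtered to one dataset) or a directory listing of Btrfs snapshots, and each printed name would then be passed to the matching delete command (zfs destroy, btrfs subvolume delete, or lvremove) after review.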

Conclusion

Snapshots and backups serve distinct roles in Linux data protection:

  • Snapshots are your “undo button” for short-term mistakes—fast, space-efficient, but dependent on the original disk.
  • Backups are your “insurance policy” for disasters—independent, secure, but slower to create/restore.

By combining snapshots (local, temporary) and backups (external, long-term), you create a resilient data protection strategy. Linux offers powerful tools for both—from LVM/Btrfs snapshots to rsync/borgbackup—so choose based on your needs (desktop vs. enterprise, CLI vs. GUI).

Remember: No single tool solves all problems. A layered approach (snapshots + backups + offsite storage) is the best way to safeguard your data.

References