Table of Contents
- Understanding RAID Basics
- What is RAID?
- Hardware vs. Software RAID
- Key Concepts: Striping, Mirroring, and Parity
- Prerequisites
- Common RAID Levels Explained
- RAID 0 (Striping)
- RAID 1 (Mirroring)
- RAID 5 (Striping with Parity)
- RAID 6 (Striping with Double Parity)
- RAID 10 (1+0: Mirroring + Striping)
- Tools for RAID in Linux: mdadm
- Step-by-Step Implementation: RAID 1 (Mirroring)
- Identify Disks
- Create the RAID Array
- Format and Mount the Array
- Persist Mounts Across Reboots
- Step-by-Step Implementation: RAID 5 (Striping with Parity)
- Verifying and Managing RAID Arrays
- Check Array Status
- Monitor RAID Health
- Replace a Failed Disk
- Grow/Expand an Array
- Troubleshooting Common Issues
- Conclusion
- References
1. Understanding RAID Basics
What is RAID?
RAID is a technology that aggregates multiple physical disks into a single logical storage unit. Its primary goals are:
- Redundancy: Protect data from disk failure (e.g., if one disk fails, data remains accessible).
- Performance: Improve read/write speeds by distributing data across disks (parallelism).
RAID is not a substitute for backups! It protects against disk failure, not accidental deletion, ransomware, or natural disasters. Always back up critical data separately.
Hardware vs. Software RAID
- Hardware RAID: Managed by a dedicated RAID controller (a physical card in the server). The OS sees the RAID array as a single disk. Pros: Offloads work from the CPU. Cons: Expensive; vendor-locked.
- Software RAID: Managed by the OS (e.g., Linux's mdadm). Pros: Cost-effective (uses existing disks), flexible, and hardware-agnostic. Cons: Uses CPU resources (minimal on modern systems).
This tutorial focuses on software RAID using mdadm.
Key Concepts
- Striping: Data is split into blocks and distributed across disks (e.g., RAID 0, 5). Improves performance but no redundancy.
- Mirroring: Exact copies of data are stored on two or more disks (e.g., RAID 1). Provides redundancy; reads can be served by any mirror, but every write must go to all of them.
- Parity: Mathematical error-correcting data stored across disks (e.g., RAID 5, 6). Allows reconstruction of data if a disk fails.
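The parity idea is easy to see with a toy XOR example: the parity block is the XOR of the data blocks, so any single missing block can be recomputed from the survivors. A minimal sketch (the byte values are arbitrary examples, not real disk data):

```shell
# Toy RAID-5-style parity: one "byte" per disk (values are arbitrary examples)
d1=0xA7; d2=0x3C
parity=$(( d1 ^ d2 ))         # parity block = XOR of the data blocks
# Simulate losing disk 1 and rebuilding its block from the survivors:
rebuilt=$(( parity ^ d2 ))
printf 'original=0x%02X rebuilt=0x%02X\n' "$d1" "$rebuilt"   # both print 0xA7
```

Real arrays do this per stripe and rotate the parity block across disks, but the reconstruction principle is the same.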
2. Prerequisites
Before starting, ensure you have:
- A Linux system (e.g., Ubuntu, CentOS, Debian). We’ll use Ubuntu 22.04 for examples.
- Multiple disks/partitions: At least 2 for RAID 1, 3 for RAID 5, etc. Use virtual disks (via VirtualBox/VMware) to practice safely.
- Root access: Use sudo or log in as root.
- Backup: All data on the target disks will be erased! Back up critical data first.
- Basic command-line familiarity: ls, fdisk, mount, etc.
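If you have no spare disks, loopback devices make safe practice targets. A sketch, assuming a few GiB free under /tmp (the file paths are illustrative):

```shell
# Create two 1 GiB sparse files to act as practice "disks"
truncate -s 1G /tmp/disk0.img /tmp/disk1.img
# Attach each file to a free loop device; losetup prints the device it chose
sudo losetup --find --show /tmp/disk0.img   # e.g., /dev/loop10
sudo losetup --find --show /tmp/disk1.img   # e.g., /dev/loop11
# Use the printed /dev/loopN names in place of /dev/sdb and /dev/sdc below.
# Clean up afterwards with: sudo losetup -d /dev/loopN && rm /tmp/disk*.img
```

Because the files are sparse, they consume disk space only as the array writes to them.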
3. Common RAID Levels Explained
RAID 0 (Striping)
- How it works: Splits data across 2+ disks (no parity/mirroring).
- Minimum disks: 2.
- Pros: Fast read/write speeds (parallelism).
- Cons: No redundancy—single disk failure = total data loss.
- Use case: Temporary storage (e.g., video editing scratch disks) where speed matters more than safety.
RAID 1 (Mirroring)
- How it works: Duplicates data across 2+ disks (exact mirrors).
- Minimum disks: 2.
- Pros: 100% redundancy (survives 1 disk failure). Simple to set up.
- Cons: 50% storage overhead (2 disks = 1 disk of usable space). No write speedup (every write hits both disks), though reads can be spread across mirrors.
- Use case: Critical data (e.g., OS, backups) where redundancy is key.
RAID 5 (Striping with Parity)
- How it works: Stripes data + single parity block across 3+ disks. Parity allows reconstruction if 1 disk fails.
- Minimum disks: 3.
- Usable space: (n-1) disks (e.g., 3x1TB disks = 2TB usable).
- Pros: Balance of performance and redundancy. Good read speeds.
- Cons: Write performance slightly slower (due to parity calculation). Vulnerable during rebuilds (second disk failure = data loss).
- Use case: General-purpose storage (e.g., file servers, databases).
RAID 6 (Striping with Double Parity)
- How it works: Stripes data + two parity blocks across 4+ disks. Survives 2 disk failures.
- Minimum disks: 4.
- Usable space: (n-2) disks (e.g., 4x1TB = 2TB usable).
- Pros: Higher redundancy than RAID 5.
- Cons: Slower writes (more parity). Higher cost (4+ disks).
- Use case: Large storage arrays where data loss is catastrophic (e.g., enterprise backups).
RAID 10 (1+0: Mirroring + Striping)
- How it works: Combines RAID 1 (mirroring) and RAID 0 (striping). First mirrors pairs of disks, then stripes across mirrors.
- Minimum disks: 4 (2 mirrored pairs).
- Usable space: 50% (e.g., 4x1TB = 2TB usable).
- Pros: Fast (striping) + redundant (mirroring). Survives 1 failure per mirror pair.
- Cons: Expensive (4+ disks).
- Use case: High-performance, high-reliability systems (e.g., databases, virtualization hosts).
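The usable-capacity rules above can be condensed into a small helper. A sketch for equal-sized disks (the function name raid_usable is made up here for illustration):

```shell
# raid_usable LEVEL NUM_DISKS DISK_SIZE_GB -> usable capacity in GB
# Assumes all disks are the same size, as in the examples above.
raid_usable() {
    case "$1" in
        0)  echo $(( $2 * $3 )) ;;          # striping: all space usable
        1)  echo "$3" ;;                    # mirroring: one disk's worth
        5)  echo $(( ($2 - 1) * $3 )) ;;    # one disk's worth lost to parity
        6)  echo $(( ($2 - 2) * $3 )) ;;    # two disks' worth lost to parity
        10) echo $(( $2 * $3 / 2 )) ;;      # half lost to mirroring
        *)  echo "unsupported level" >&2; return 1 ;;
    esac
}
raid_usable 5 3 1000    # 3x1TB in RAID 5  -> prints 2000
raid_usable 10 4 1000   # 4x1TB in RAID 10 -> prints 2000
```

Note that mdadm reports capacity in binary units, so real numbers will differ slightly from these round figures.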
4. Tools for RAID in Linux: mdadm
mdadm (Multiple Device Admin) is the standard tool for managing Linux software RAID. It creates, assembles, monitors, and repairs RAID arrays. Install it via your package manager:
# Ubuntu/Debian
sudo apt install mdadm
# CentOS/RHEL
sudo dnf install mdadm
Key mdadm commands:
- mdadm --create: Build a new RAID array.
- mdadm --detail: Show array status.
- mdadm --monitor: Watch for disk failures.
- mdadm --fail / --remove / --add: Manage failed disks.
5. Step-by-Step Implementation: RAID 1 (Mirroring)
We’ll create a RAID 1 array with 2 disks (/dev/sdb and /dev/sdc).
Step 1: Identify Disks
List all disks to confirm their device names (e.g., sdb, sdc):
lsblk # Lists all block devices (disks/partitions)
# OR
sudo fdisk -l | grep "Disk /dev/sd" # Shows disk sizes
Example output:
Disk /dev/sdb: 100 GiB, 107374182400 bytes
Disk /dev/sdc: 100 GiB, 107374182400 bytes
Ensure no important data is on sdb/sdc—we’ll erase them!
Step 2: Create the RAID 1 Array
Use mdadm --create to build the array. We’ll name it /dev/md0 (common convention for mdadm arrays):
sudo mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc
- --create /dev/md0: Create the array at /dev/md0.
- --level=1: RAID 1 (mirroring).
- --raid-devices=2: Number of disks in the array.
- /dev/sdb /dev/sdc: Disks to include.
Confirm with:
cat /proc/mdstat # Shows RAID status (resync in progress)
Output during resync:
Personalities : [raid1]
md0 : active raid1 sdc[1] sdb[0]
104857536 blocks super 1.2 [2/2] [UU]
[======>..............] resync = 35.2% (36945920/104857536) finish=0.5min speed=225685K/sec
[UU] means both disks are active; a failed member shows as _ (e.g., [U_]). The array is usable during the initial resync, which is done once the progress bar reaches 100%.
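That bracket marker makes /proc/mdstat easy to check from a script. A minimal probe (the function name mdstat_ok is made up here; it accepts an alternate file path so it can be tested against sample output):

```shell
# Succeed only if some array shows all members up ([UU], [UUU], ...) and
# no array shows a down member (an _ inside the brackets, e.g., [U_]).
mdstat_ok() {
    local f="${1:-/proc/mdstat}"
    grep -Eq '\[U+\]' "$f" && ! grep -Eq '\[[U_]*_[U_]*\]' "$f"
}
if mdstat_ok; then
    echo "all RAID members active"
else
    echo "array degraded, resyncing, or no arrays found" >&2
fi
```

A cron job wrapping this probe is a crude but effective complement to mdadm --monitor.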
Step 3: Format and Mount the Array
Once the array is ready, create a filesystem (we’ll use ext4, the most common Linux filesystem):
sudo mkfs.ext4 /dev/md0 # Format /dev/md0 as ext4
Mount the array to a directory (e.g., /mnt/raid1):
sudo mkdir -p /mnt/raid1 # Create mount point
sudo mount /dev/md0 /mnt/raid1 # Mount the array
Verify with df -h:
Filesystem Size Used Avail Use% Mounted on
/dev/md0 98G 60M 93G 1% /mnt/raid1
Step 4: Persist Mounts Across Reboots
To mount /dev/md0 automatically on boot, add it to /etc/fstab. First, get the array’s UUID with blkid:
sudo blkid /dev/md0
Output (example):
/dev/md0: UUID="a1b2c3d4-1234-5678-90ab-cdef01234567" TYPE="ext4"
Edit /etc/fstab with sudo nano /etc/fstab and add:
UUID=a1b2c3d4-1234-5678-90ab-cdef01234567 /mnt/raid1 ext4 defaults 0 0
Test the fstab entry with sudo mount -a (no errors = success).
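The UUID lookup and fstab edit can also be scripted rather than copied by hand. A sketch, assuming the array and mount point from above (blkid -s UUID -o value prints just the UUID string):

```shell
# Append an fstab entry for the array without hand-copying the UUID
uuid=$(sudo blkid -s UUID -o value /dev/md0)
echo "UUID=${uuid} /mnt/raid1 ext4 defaults 0 0" | sudo tee -a /etc/fstab
sudo mount -a   # re-reads fstab; no errors means the new entry is valid
```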
6. Step-by-Step Implementation: RAID 5 (Striping with Parity)
RAID 5 requires 3+ disks. We’ll use /dev/sdb, /dev/sdc, /dev/sdd (3x100GB disks = 200GB usable space).
Step 1: Create the RAID 5 Array
sudo mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
- --level=5: RAID 5.
- --raid-devices=3: 3 disks.
Check status with cat /proc/mdstat—resync will take longer than RAID 1 due to parity calculation.
Step 2: Format, Mount, and Persist
Same as RAID 1:
sudo mkfs.ext4 /dev/md0
sudo mkdir -p /mnt/raid5
sudo mount /dev/md0 /mnt/raid5
Add to /etc/fstab using the UUID (via blkid /dev/md0).
7. Verifying and Managing RAID Arrays
Check Array Status
Detailed info about /dev/md0:
sudo mdadm --detail /dev/md0
Key output:
- State: active/clean (healthy) or degraded (one or more disks failed).
- Active Devices: Number of working disks.
- Failed Devices: Disks that need replacement.
Monitor RAID Health
Enable mdadm monitoring to get alerts on disk failures:
sudo mdadm --monitor --scan --daemon # Runs in background
To log alerts to a file, edit /etc/mdadm/mdadm.conf and add:
MAILADDR [email protected] # Sends alerts to email (requires mail setup)
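Put together, a minimal /etc/mdadm/mdadm.conf might look like the sketch below. The ARRAY line is illustrative only; generate the real one for your system with sudo mdadm --detail --scan:

```text
# /etc/mdadm/mdadm.conf (sketch)
MAILADDR [email protected]
# Illustrative only -- produce this line with: sudo mdadm --detail --scan
ARRAY /dev/md0 metadata=1.2 name=myhost:0 UUID=a1b2c3d4:12345678:90abcdef:01234567
```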
Replace a Failed Disk (RAID 1/5/6/10)
If a disk fails (e.g., /dev/sdb in RAID 1):
- Identify the failure:
sudo mdadm --detail /dev/md0 | grep "Failed Devices"  # e.g., Failed Devices : 1
(The device table at the end of the --detail output marks which member is faulty.)
- Mark the disk as failed (if mdadm has not already done so):
sudo mdadm /dev/md0 --fail /dev/sdb
- Remove the failed disk:
sudo mdadm /dev/md0 --remove /dev/sdb
- Add the new disk (e.g., /dev/sde):
sudo mdadm /dev/md0 --add /dev/sde
- Monitor the resync:
cat /proc/mdstat  # Resync progress
Grow/Expand an Array (RAID 5/6)
Some RAID levels (e.g., RAID 5, 6) support adding more disks to increase capacity. For RAID 5 with 3 disks, add a 4th disk (/dev/sde):
sudo mdadm --add /dev/md0 /dev/sde # Add the new disk
sudo mdadm --grow /dev/md0 --raid-devices=4 # Expand to 4 disks
After the array grows, resize the filesystem:
sudo resize2fs /dev/md0 # For ext4; use xfs_growfs for XFS
8. Troubleshooting Common Issues
Array Not Assembling on Boot
- Cause: mdadm.conf is missing the array definition.
- Fix: Update /etc/mdadm/mdadm.conf:
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
sudo update-initramfs -u  # Update the initramfs so the array is detected at boot
Disk Not Detected
- Check connections: For physical disks, ensure cables are secure.
- Verify disk health: Use smartctl (install with sudo apt install smartmontools):
sudo smartctl -H /dev/sdb  # Checks disk health (PASSED = good)
Filesystem Errors
- Run fsck on the unmounted array:
sudo umount /mnt/raid1
sudo fsck /dev/md0
9. Conclusion
RAID is a powerful tool for balancing data redundancy and performance in Linux. With mdadm, setting up RAID 1 (mirroring) or RAID 5 (striping with parity) is straightforward, even for beginners. Remember:
- RAID ≠ backup: Always back up data separately.
- Test in a VM first: Use virtual disks to practice without risk.
- Monitor arrays: Regularly check status with mdadm --detail and set up alerts.
By following this tutorial, you’re ready to implement RAID in your Linux environment and protect your data from disk failures.