thelinuxvault guide

Automating System Administration Tasks on Linux with Bash

System administration on Linux often involves repetitive, time-consuming tasks—backing up data, monitoring system health, rotating logs, managing users, and updating packages, to name a few. Manually performing these tasks increases the risk of human error, wastes valuable time, and undermines consistency across systems. **Bash (Bourne Again Shell)** emerges as a powerful ally for automation. Preinstalled on nearly all Linux distributions, Bash provides a robust scripting environment to automate these tasks with minimal overhead. It integrates seamlessly with Linux utilities (e.g., `tar`, `grep`, `df`), supports conditional logic, loops, and functions, and requires no additional dependencies. This blog will guide you through the fundamentals of Bash scripting for system administration, walk through practical automation examples, and share best practices to write reliable, maintainable scripts. Whether you’re a junior sysadmin or a seasoned engineer, mastering Bash automation will streamline your workflow and make you more efficient.

Table of Contents

  1. Why Bash for System Automation?
  2. Essential Bash Concepts for Automation
    • Variables & Environment
    • Loops (For, While, Until)
    • Conditionals (If-Else, Case)
    • Functions
    • Command Substitution & Exit Codes
  3. Practical Automation Examples
    • Automated Backups
    • Log Rotation
    • System Resource Monitoring
    • Bulk User Management
    • Service Health Check & Restart
    • Automated Package Updates
  4. Best Practices for Bash Scripting
  5. Advanced Tips & Tools
    • Scheduling with cron
    • Error Handling with trap
    • Text Processing with awk/sed
  6. Conclusion
  7. References

Why Bash for System Automation?

Before diving into scripting, let’s clarify why Bash is ideal for system administration automation:

  • Ubiquity: Bash is preinstalled on every Linux/Unix system. No need to install additional tools (unlike Python or Ansible, which require setup).
  • Integration: Bash natively interacts with Linux core utilities (ls, grep, tar, systemctl) and system calls, making it easy to chain commands.
  • Simplicity: Bash scripts are lightweight and readable, even for beginners. You don’t need to learn complex syntax to write basic automation.
  • Speed: For small-to-medium tasks, Bash scripts execute faster than interpreted languages (e.g., Python) since they avoid startup overhead.
  • Flexibility: Combine Bash with tools like cron (scheduling), mail (alerts), and awk (text processing) to build powerful workflows.

Essential Bash Concepts for Automation

To write effective Bash scripts, you’ll need to master these core concepts:

1. Variables

Store data (paths, timestamps, thresholds) for reuse. Use = to assign values (no spaces around =), and $variable to access them.

# Declare variables
BACKUP_DIR="/var/backups"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)  # Command substitution (see below)

# Access variables (quote to handle spaces in values)
echo "Backup will be stored in: $BACKUP_DIR"

Environment Variables: Predefined variables like $HOME (user’s home), $PATH (executable search path), and $USER (current user).

2. Loops

Repeat actions (e.g., process files, iterate over users).

  • for Loop: Iterate over a list.

    # Backup all .log files in /var/log
    for logfile in /var/log/*.log; do
      echo "Backing up: $logfile"
      cp "$logfile" "$BACKUP_DIR/"
    done
  • while Loop: Run until a condition fails.

    # Monitor a service until it starts
    SERVICE="nginx"
    while ! systemctl is-active --quiet "$SERVICE"; do
      echo "$SERVICE is down. Retrying in 5s..."
      sleep 5
    done
    echo "$SERVICE is running!"
  • until Loop: Run until a condition succeeds (opposite of while).

3. Conditionals

Make decisions (e.g., check if a file exists, if disk usage is high).

  • if-else Statements:

    # Check if backup directory exists
    if [ -d "$BACKUP_DIR" ]; then
      echo "Backup directory exists."
    else
      echo "Creating backup directory: $BACKUP_DIR"
      mkdir -p "$BACKUP_DIR"  # -p creates parent dirs if missing
    fi

    Common condition checks:

    • -d "$dir": Directory exists.
    • -f "$file": File exists.
    • -z "$var": Variable is empty.
    • $a -gt $b: $a is greater than $b (numeric comparison).
  • case Statement: Simplify multiple if-else checks.

    # Handle script arguments
    case "$1" in
      start) systemctl start nginx ;;
      stop) systemctl stop nginx ;;
      restart) systemctl restart nginx ;;
      *) echo "Usage: $0 {start|stop|restart}" ;;
    esac

4. Functions

Reuse code blocks (e.g., logging, error handling). Define with function name() { ... } or name() { ... }.

# Log messages with timestamps
log() {
  echo "[$(date +%Y-%m-%dT%H:%M:%S)] $1" >> "$LOG_FILE"
}

# Call function (pass arguments like $1, $2, etc.)
LOG_FILE="/var/log/backup.log"
log "Backup process started"

5. Command Substitution

Embed the output of a command into a variable or command. Use $(command) (preferred) or backticks `command`.

# Get disk usage percentage for / partition
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')
echo "Disk usage: $DISK_USAGE%"

6. Exit Codes

Bash uses numeric codes to indicate success/failure:

  • 0: Success.
  • Non-zero (1-255): Failure (e.g., 1 = general error, 2 = incorrect usage).

Check the exit code of the last command with $?:

tar -czf backup.tar.gz /home  # Create backup
if [ $? -eq 0 ]; then  # $? = exit code of tar
  echo "Backup succeeded!"
else
  echo "Backup failed (code: $?)."
  exit 1  # Exit script with failure code
fi

Practical Automation Examples

Let’s apply these concepts to real-world sysadmin tasks. Each example includes a script snippet and explanations.

Example 1: Automated Backup Script

Goal: Backup /home to a compressed tarball with a timestamp, log the process, and validate success.

#!/bin/bash
set -euo pipefail  # Exit on error, unset variable, or pipeline failure (see Best Practices)

# Configuration
SOURCE_DIR="/home"
BACKUP_DIR="/var/backups/home"
LOG_FILE="$BACKUP_DIR/backup_logs.txt"
RETENTION_DAYS=30  # Keep backups for 30 days

# Create backup directory if missing
mkdir -p "$BACKUP_DIR"

# Log function
log() {
  echo "[$(date +%Y-%m-%dT%H:%M:%S)] $1" >> "$LOG_FILE"
}

log "Starting backup of $SOURCE_DIR..."

# Generate backup filename with timestamp
BACKUP_FILE="$BACKUP_DIR/home_backup_$(date +%Y%m%d_%H%M%S).tar.gz"

# Perform backup (c = create, z = compress, f = file, v = verbose)
if tar -czvf "$BACKUP_FILE" -C "$SOURCE_DIR" .; then  # -C: change to SOURCE_DIR first
  log "Backup successful: $BACKUP_FILE"
else
  log "ERROR: Backup failed!"
  exit 1
fi

# Cleanup: Delete backups older than RETENTION_DAYS
log "Cleaning up backups older than $RETENTION_DAYS days..."
find "$BACKUP_DIR" -name "home_backup_*.tar.gz" -mtime +"$RETENTION_DAYS" -delete

log "Backup process completed.\n"

How to Use:

  • Save as backup_home.sh.
  • Make executable: chmod +x backup_home.sh.
  • Test: ./backup_home.sh.
  • Schedule with cron (see Advanced Tips) for daily backups.

Example 2: Log Rotation Script

Goal: Compress old logs, keep only the last 7 days of logs, and delete older ones.

#!/bin/bash
set -euo pipefail

LOG_DIR="/var/log"
DAYS_TO_KEEP=7  # Keep logs for 7 days
LOG_FILE="/var/log/log_rotation.log"

log() {
  echo "[$(date +%Y-%m-%dT%H:%M:%S)] $1" >> "$LOG_FILE"
}

log "Starting log rotation..."

# Compress .log files older than 1 day (not modified in 24h)
find "$LOG_DIR" -name "*.log" -type f -mtime +1 -print0 | while IFS= read -r -d $'\0' logfile; do
  log "Compressing: $logfile"
  gzip "$logfile"  # Replaces logfile with logfile.gz
done

# Delete .log.gz files older than DAYS_TO_KEEP
find "$LOG_DIR" -name "*.log.gz" -type f -mtime +"$DAYS_TO_KEEP" -print0 | while IFS= read -r -d $'\0' old_log; do
  log "Deleting old log: $old_log"
  rm "$old_log"
done

log "Log rotation completed.\n"

Example 3: System Resource Monitoring with Alerts

Goal: Check CPU, memory, and disk usage. Send an email alert if thresholds are exceeded.

#!/bin/bash
set -euo pipefail

# Thresholds (adjust as needed)
CPU_THRESHOLD=80  # %
MEM_THRESHOLD=80  # %
DISK_THRESHOLD=85  # %
ALERT_EMAIL="[email protected]"

# Get metrics
CPU_USAGE=$(top -bn1 | awk '/^%Cpu/ {print $2}' | cut -d. -f1)  # 10-second sample
MEM_USAGE=$(free | awk '/Mem:/ {printf "%.0f", $3/$2*100}')  # Used mem %
DISK_USAGE=$(df -h / | awk 'NR==2 {print $5}' | sed 's/%//')  # Root disk %

# Check CPU
if [ "$CPU_USAGE" -gt "$CPU_THRESHOLD" ]; then
  ALERT_MSG="ALERT: CPU usage high ($CPU_USAGE% > $CPU_THRESHOLD%)"
  echo "$ALERT_MSG" | mail -s "System Alert: High CPU" "$ALERT_EMAIL"
fi

# Check memory (similar logic)
if [ "$MEM_USAGE" -gt "$MEM_THRESHOLD" ]; then
  ALERT_MSG="ALERT: Memory usage high ($MEM_USAGE% > $MEM_THRESHOLD%)"
  echo "$ALERT_MSG" | mail -s "System Alert: High Memory" "$ALERT_EMAIL"
fi

# Check disk (similar logic)
if [ "$DISK_USAGE" -gt "$DISK_THRESHOLD" ]; then
  ALERT_MSG="ALERT: Disk usage high ($DISK_USAGE% > $DISK_THRESHOLD%)"
  echo "$ALERT_MSG" | mail -s "System Alert: High Disk" "$ALERT_EMAIL"
fi

Note: Install mailutils (Debian/Ubuntu) or postfix to use the mail command.

Example 4: Bulk User Creation

Goal: Create multiple users from a text file, set temporary passwords, and add them to a group.

#!/bin/bash
set -euo pipefail

# Input file: one username per line (e.g., users.txt)
USER_LIST="users.txt"
GROUP="developers"
TEMP_PASSWORD="TempPass123!"  # Reset after first login

# Check if user list exists
if [ ! -f "$USER_LIST" ]; then
  echo "Error: $USER_LIST not found!"
  exit 1
fi

# Create group if it doesn't exist
if ! getent group "$GROUP"; then
  groupadd "$GROUP"
  echo "Created group: $GROUP"
fi

# Read user list and create users
while IFS= read -r username; do
  if id "$username" &>/dev/null; then  # Check if user exists
    echo "User $username already exists. Skipping."
  else
    useradd -m -g "$GROUP" -s /bin/bash "$username"  # -m: create home, -s: shell
    echo "$username:$TEMP_PASSWORD" | chpasswd  # Set password
    chage -d 0 "$username"  # Force password change on first login
    echo "Created user: $username (group: $GROUP, temp password: $TEMP_PASSWORD)"
  fi
done < "$USER_LIST"

Best Practices for Bash Scripting

Follow these practices to write robust, maintainable scripts:

  1. Shebang Line: Start with #!/bin/bash (not #!/bin/sh, which may use a minimal shell like dash).

  2. Strict Error Handling: Use set -euo pipefail to exit on:

    • e: Errors (non-zero exit codes).
    • u: Undefined variables.
    • o pipefail: Pipeline failures (e.g., if cmd1 | cmd2 fails, the script exits).
  3. Quote Variables: Always quote variables ("$VAR") to handle spaces in filenames/paths:

    # Bad: Fails if $FILE has spaces
    cat $FILE  
    
    # Good: Handles spaces
    cat "$FILE"
  4. Avoid Hard-Coded Paths: Use variables for paths (e.g., BACKUP_DIR instead of /var/backups).

  5. Logging: Redirect output to a log file (e.g., >> "$LOG_FILE") instead of relying on stdout.

  6. Comment Liberally: Explain why (not just what) the code does.

  7. Test with shellcheck: A tool to detect bugs and bad practices. Install with apt install shellcheck, then run shellcheck script.sh.

  8. Secure Scripts:

    • Restrict permissions: chmod 700 script.sh (only owner can read/execute).
    • Avoid running as root unless necessary. Use sudo for specific commands instead.
  9. Handle Errors Gracefully: Use trap (see Advanced Tips) to clean up temporary files on exit, or if statements to check command success.

Advanced Tips & Tools

Scheduling with cron

Automate scripts to run at specific times with cron. Edit the crontab with crontab -e, and use this format:

# Minute Hour Day Month Weekday Command
0 2 * * * /path/to/backup_home.sh  # Run daily at 2:00 AM

Use crontab.guru to generate schedules.

Error Handling with trap

Catch signals (e.g., Ctrl+C, script exit) to clean up resources (temp files, locks):

#!/bin/bash
TEMP_FILE=$(mktemp)  # Create temporary file

# Cleanup on exit (0 = success, 1 = error, SIGINT = Ctrl+C)
trap 'rm -f "$TEMP_FILE"; echo "Cleaned up temp file."' EXIT SIGINT

# Use $TEMP_FILE...
echo "Temporary data" > "$TEMP_FILE"

Text Processing with awk/sed

Use awk (data extraction) and sed (text substitution) to parse logs, configs, or command output:

# Example: Extract failed SSH login attempts from /var/log/auth.log
awk '/Failed password/ {print $1, $2, $3, $11}' /var/log/auth.log

Conclusion

Bash scripting is a cornerstone of Linux system administration. By automating repetitive tasks—backups, log rotation, monitoring, and user management—you’ll save time, reduce errors, and ensure consistency across systems.

Start small: automate one task (e.g., the backup script above), then build complexity. Use the examples here as templates, and refer to the best practices to refine your scripts. With practice, you’ll unlock the full power of Bash to manage Linux systems efficiently.

References