thelinuxvault guide

Debugging Linux Kernel Modules: Practical Approaches

Linux kernel modules are critical for extending the kernel’s functionality, from device drivers to file systems and system calls. However, debugging them is far more challenging than debugging user-space applications. Unlike user-space code, kernel modules operate in a restricted environment with no access to standard libraries, limited error-handling mechanisms, and the risk of crashing the entire system if something goes wrong. A single bug can lead to kernel panics, data corruption, or unpredictable behavior.

This blog demystifies kernel module debugging by exploring **practical, hands-on approaches** used by developers. We’ll cover tools and techniques tailored to the kernel’s constraints, from simple print-based debugging to advanced tracing and remote debugging. Whether you’re a novice kernel developer or an experienced engineer, this guide will equip you with the skills to diagnose and fix issues in kernel modules effectively.

Table of Contents

  1. Prerequisites
  2. Common Challenges in Kernel Module Debugging
  3. Debugging Methods
  4. Best Practices
  5. Conclusion
  6. References

Prerequisites

Before diving in, ensure you have the following setup:

  • Kernel Source and Headers: Install the kernel source code and headers for your target kernel version (e.g., linux-source-5.4 on Ubuntu).
  • Build Tools: gcc, make, binutils, and libncurses-dev (for kernel configuration).
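  • Debugging Tools: gdb, qemu-system-x86_64, and crash. KGDB, Kprobes, and Ftrace are kernel features rather than packages; they are enabled through the kernel configuration options below.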
  • Test Environment: A virtual machine (VM) or embedded device to test modules safely (avoid debugging on a production system!).
  • Kernel Configuration: Enable debugging options in the kernel (via make menuconfig):
    • CONFIG_DEBUG_INFO: Generates debug symbols (critical for tools like GDB).
    • CONFIG_KGDB: Enables KGDB remote debugging.
    • CONFIG_DEBUG_FS: Enables debugfs (used by dynamic debug and ftrace).
    • CONFIG_KPROBES: Enables Kprobes for function tracing.
    • CONFIG_KEXEC and CONFIG_CRASH_DUMP: Enable Kdump kernel crash dumping (for post-mortem analysis).

Common Challenges in Kernel Module Debugging

Kernel module debugging is uniquely difficult due to:

  • No User-Space Libraries: Modules can’t use printf, malloc, or standard C libraries. Instead, they rely on kernel-specific functions like printk and kmalloc.
  • System-Wide Impact: A bug (e.g., a NULL pointer dereference) can crash the kernel, taking down the entire system.
  • Limited Visibility: Direct access to memory is restricted, and race conditions or concurrency issues (e.g., unprotected shared data) are hard to reproduce.
  • Timing Sensitivity: Debugging tools can alter execution timing, masking race conditions or deadlocks.

Debugging Methods

Let’s explore the most effective tools and techniques for debugging kernel modules.

1. Printk: The Kernel’s “Hello World” of Debugging

printk is the kernel’s equivalent of printf, but with critical differences: it outputs to the kernel log buffer (not stdout) and supports log levels to prioritize messages. It’s the simplest debugging tool and ideal for initial diagnostics.

How It Works:

  • Log Levels: printk messages are tagged with severity levels (defined in <linux/kern_levels.h>), from highest (KERN_EMERG, 0) to lowest (KERN_DEBUG, 7). Every message is stored in the kernel log buffer; only messages whose numeric level is lower than the current console_loglevel (configurable via sysctl kernel.printk) are also printed on the console.

    // Log levels (from highest to lowest severity)
    KERN_EMERG    // System is unusable (e.g., "Kernel panic")
    KERN_ALERT    // Action must be taken immediately
    KERN_CRIT     // Critical conditions (e.g., hardware errors)
    KERN_ERR      // Errors (e.g., failed allocations)
    KERN_WARNING  // Warnings (e.g., deprecated API usage)
    KERN_NOTICE   // Normal but significant events
    KERN_INFO     // Informational messages (e.g., module load)
    KERN_DEBUG    // Debug-level messages (not printed on the console by default)
  • Output: Messages are stored in the kernel log buffer. Use dmesg to view them, check /var/log/kern.log (persistent log) on syslog-based systems such as Debian and Ubuntu, or run journalctl -k on systemd-based systems.
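
The console filtering rule can be modeled in a few lines of user-space C. This is only a sketch of the comparison, not kernel code: the helper names parse_console_loglevel and reaches_console are hypothetical, and it assumes the first of the four integers in the kernel.printk sysctl is the current console log level.

```c
#include <assert.h>
#include <stdio.h>

/* Numeric values of the kernel log levels (KERN_EMERG = 0 ... KERN_DEBUG = 7). */
enum { LVL_EMERG = 0, LVL_ALERT, LVL_CRIT, LVL_ERR,
       LVL_WARNING, LVL_NOTICE, LVL_INFO, LVL_DEBUG };

/* Hypothetical helper: extract the console log level, i.e. the first of the
 * four integers in the kernel.printk sysctl (for example "4 4 1 7").
 * Returns -1 if the string does not start with an integer. */
static int parse_console_loglevel(const char *printk_sysctl)
{
    int console = -1;

    assert(printk_sysctl != NULL);
    if (sscanf(printk_sysctl, "%d", &console) != 1)
        return -1;
    return console;
}

/* A message is printed on the console only when its numeric level is
 * strictly lower (more severe) than console_loglevel. It is stored in
 * the log buffer either way. */
static int reaches_console(int msg_level, int console_loglevel)
{
    return msg_level < console_loglevel;
}
```

With a console log level of 4, reaches_console(LVL_ERR, 4) is true while reaches_console(LVL_WARNING, 4) is false, which is why warnings and below normally appear only in dmesg.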

Example: Using printk in a Module

Consider a simple module that logs messages at different levels:

#include <linux/init.h>
#include <linux/module.h>
#include <linux/printk.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("printk Debugging Example");

static int __init mymodule_init(void) {
    printk(KERN_EMERG "EMERG: Module loaded (emergency)\n");
    printk(KERN_ALERT "ALERT: Module loaded (alert)\n");
    printk(KERN_ERR "ERR: Module loaded (error)\n");
    printk(KERN_DEBUG "DEBUG: Module loaded (debug)\n"); // Stored in the log, but not printed on the console by default
    return 0;
}

static void __exit mymodule_exit(void) {
    printk(KERN_INFO "INFO: Module unloaded\n");
}

module_init(mymodule_init);
module_exit(mymodule_exit);

Build and Load the Module (the build assumes a Makefile in the current directory containing obj-m += mymodule.o):

make -C /lib/modules/$(uname -r)/build M=$(pwd) modules
sudo insmod mymodule.ko

View Logs:

dmesg | grep "Module loaded"  # Filter this module's messages (they do not contain the module name)
# Or check persistent logs:
tail -f /var/log/kern.log
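
Under the hood, dmesg reads records from /dev/kmsg, where each record begins with a numeric prefix that encodes the syslog facility and the log level. The following user-space sketch decodes one such record; the struct and function names are hypothetical, and it assumes the record layout "prefix,sequence,timestamp_usec,flags;message".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Decoded fields of one /dev/kmsg record (hypothetical struct for this sketch). */
struct kmsg_record {
    int level;               /* 0 (KERN_EMERG) .. 7 (KERN_DEBUG) */
    int facility;            /* syslog facility; 0 for kernel messages */
    unsigned long long seq;  /* sequence number of the record */
    unsigned long long usec; /* timestamp in microseconds */
    const char *text;        /* points into the input line, after ';' */
};

/* Parse "prefix,seq,usec,flags;message". Returns 0 on success, -1 otherwise. */
static int parse_kmsg_record(const char *line, struct kmsg_record *out)
{
    unsigned int prefix;
    const char *semi;

    assert(line != NULL && out != NULL);
    if (sscanf(line, "%u,%llu,%llu,", &prefix, &out->seq, &out->usec) != 3)
        return -1;
    semi = strchr(line, ';');
    if (semi == NULL)
        return -1;
    out->level = (int)(prefix & 7);     /* low three bits: log level */
    out->facility = (int)(prefix >> 3); /* remaining bits: facility */
    out->text = semi + 1;
    return 0;
}
```

A record that starts with 3 therefore carries a KERN_ERR message, which is the level the example module uses for its "ERR:" line.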

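Enable Debug Messages: KERN_DEBUG messages are always stored in the log buffer (and visible in dmesg), but by default they are not printed on the console. To show them there, temporarily raise the console_loglevel:

sudo sysctl kernel.printk=8  # console_loglevel 8 prints every level, including KERN_DEBUG (7)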

Pros/Cons:

  • Pros: Simple, no setup required, works in all kernel versions.
  • Cons: Clutters logs, can’t dynamically enable/disable (without dynamic debug), and may alter timing (avoid in performance-critical code).

2. Dynamic Debug: Fine-Grained Print Control

Dynamic debug extends printk by allowing you to enable/disable specific messages at runtime via debugfs, without recompiling the kernel or module. It’s ideal for debugging without cluttering logs permanently.

How It Works:

  • Debugfs Interface: Dynamic debug uses debugfs (mounted at /sys/kernel/debug/). The dynamic_debug/control file lets you enable messages by module, function, file, or format string.
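  • Mark Debug Messages: Use pr_debug() (or dev_dbg() for device drivers) instead of printk(KERN_DEBUG ...). With CONFIG_DYNAMIC_DEBUG enabled, these macros expand to __dynamic_pr_debug call sites that dynamic debug can switch on and off; without it, they compile to nothing unless DEBUG is defined.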

Example: Enabling Dynamic Debug

  1. Modify the Module to use pr_debug:

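    #include <linux/init.h>
    #include <linux/module.h>
    #include <linux/printk.h>
    
    MODULE_LICENSE("GPL");
    
    static int __init mymodule_init(void) {
        pr_debug("mymodule: Debug message from init (file: %s, line: %d)\n", __FILE__, __LINE__);
        return 0;
    }
    
    // Exit handler: required by module_exit() below and needed for rmmod
    static void __exit mymodule_exit(void) {
        pr_debug("mymodule: Debug message from exit\n");
    }
    
    module_init(mymodule_init);
    module_exit(mymodule_exit);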
  2. Load the Module and list available dynamic debug messages:

    sudo insmod mymodule.ko
    sudo cat /sys/kernel/debug/dynamic_debug/control | grep "mymodule"
    # Example output (path and line number will vary; "=_" means no flags are set):
    # mymodule.c:8 [mymodule]mymodule_init =_ "mymodule: Debug message from init (file: %s, line: %d)\012"
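  3. Enable Messages by pattern (e.g., module name):

    echo "module mymodule +p" | sudo tee /sys/kernel/debug/dynamic_debug/control
    # "+p" = enable printing; other flags: +f (show function), +l (show line), +m (show module)
    # (plain "sudo echo ... > file" fails because the redirection runs without root privileges)
  4. View Debug Output: The init message fires during insmod, before the step above has taken effect, so reload the module with its messages enabled at load time:

    sudo rmmod mymodule
    sudo insmod mymodule.ko dyndbg=+p
    dmesg | grep "mymodule: Debug message"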
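
Each entry in the control file has the shape file:line [module]function =flags "format". When a module has many debug call sites, a small user-space parser can help check which ones are enabled. The sketch below is an illustration under that assumed layout; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One call site from dynamic_debug/control (hypothetical struct for this sketch). */
struct ddebug_entry {
    char file[256];
    unsigned int line;
    char module[64];
    char func[64];
    char flags[16];   /* "_" when nothing is enabled, e.g. "p" or "pfl" otherwise */
    int enabled;      /* 1 if the 'p' (print) flag is set */
};

/* Parse: file:line [module]function =flags "format"
 * Returns 0 on success, -1 if the line does not match (e.g. the header line). */
static int parse_ddebug_line(const char *text, struct ddebug_entry *e)
{
    assert(text != NULL && e != NULL);
    if (sscanf(text, "%255[^:]:%u [%63[^]]]%63s =%15s",
               e->file, &e->line, e->module, e->func, e->flags) != 5)
        return -1;
    e->enabled = strchr(e->flags, 'p') != NULL;
    return 0;
}
```

Feeding it every line of the control file and keeping the entries whose module field is mymodule gives a quick overview of which messages will actually be printed.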

Pros/Cons:

  • Pros: Dynamically enable/disable messages, no recompilation, reduces log clutter.
  • Cons: Requires CONFIG_DEBUG_FS and CONFIG_DYNAMIC_DEBUG in the kernel.

3. KGDB: Remote Debugging with GDB

KGDB (Kernel GDB) lets you debug the kernel remotely using GDB, just like user-space debugging. It supports breakpoints, stack traces, variable inspection, and memory dumps—critical for diagnosing complex bugs.

Setup Steps:

  1. Configure the Kernel: Enable KGDB via make menuconfig:

    • Kernel hacking → KGDB: kernel debugger
    • KGDB: use kgdb over serial (or kgdb over USB for embedded systems).
  2. Boot the Kernel with KGDB Options: Add kgdboc=ttyS0,115200 kgdbwait to the kernel command line (e.g., in grub.cfg). This tells the kernel to wait for a GDB connection on serial port ttyS0 at 115200 baud.

  3. Connect GDB from a Host Machine:

    • On the target (the system running the module), the kernel will pause at boot with a message like: Waiting for connection from remote gdb....
    • On the host machine, run:
      gdb vmlinux  # vmlinux is the uncompressed kernel image with debug symbols
      (gdb) set serial baud 115200  # Match the rate given in kgdboc
      (gdb) target remote /dev/ttyUSB0  # Use the serial port connected to the target

Debugging a Module with KGDB:

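  1. Load Symbols for the Module: Once the module is loaded on the target, load its debug symbols into GDB at the address of its .text section:

    (gdb) add-symbol-file mymodule.ko 0xffffffffc0000000  # Example address; read the real one as root from /sys/module/mymodule/sections/.text (the base address in /proc/modules usually matches)
  2. Set Breakpoints:

    (gdb) break mymodule_exit  # mymodule_init has already run by now; to debug init code, break on do_init_module before running insmod
    (gdb) continue  # Resume kernel execution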
  3. Inspect State: When the breakpoint hits, use GDB commands like print, backtrace, info registers, or x (examine memory):

    (gdb) print some_variable  # Inspect a variable in the module
    (gdb) backtrace  # Show call stack
    (gdb) x/10xw 0xffffffffc0000000  # Examine memory at the module's load address
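
The load address needed by add-symbol-file is the hexadecimal field on the module's line in /proc/modules. The following user-space sketch extracts it; the function name is hypothetical, and it assumes the usual six-column layout (name, size, reference count, dependencies, state, address). Note that the kernel reports the address as zero to unprivileged readers, so the file has to be read as root.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `line` (one line of /proc/modules) describes the
 * module `name`, store its load address in *addr and return 0.
 * Returns -1 for other modules or malformed lines. */
static int module_load_address(const char *line, const char *name,
                               unsigned long long *addr)
{
    char mod[64];
    unsigned long long a;

    assert(line != NULL && name != NULL && addr != NULL);
    /* Columns: name size refcount deps state address [taint flags] */
    if (sscanf(line, "%63s %*s %*s %*s %*s %llx", mod, &a) != 2)
        return -1;
    if (strcmp(mod, name) != 0)
        return -1;
    *addr = a;
    return 0;
}
```

Looping over /proc/modules with fgets and calling this helper on each line yields the value to paste into the add-symbol-file command.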

Pros/Cons:

  • Pros: Full GDB functionality (breakpoints, variables, stack traces), ideal for complex logic bugs.
  • Cons: Requires kernel reconfiguration, slow (pauses the entire system), and needs a serial/network connection.

4. Kprobes: Tracing Kernel Functions

Kprobes is a powerful framework for dynamically tracing kernel functions without modifying their code. It lets you attach “probes” to almost any kernel instruction, most commonly a function’s entry point (even in built-in code, not just modules; a small set of blacklisted functions cannot be probed), and execute custom handlers when the probe is hit or the function returns.

Types of Kprobes:

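  • Kprobes: Attach to an instruction address (typically a function’s entry point) and run a handler before (pre_handler) and after (post_handler) that instruction executes.
  • Kretprobes: Trigger a handler when a function returns (via a trampoline), giving access to its return value.
  • Jprobes: Gave handlers direct access to the probed function’s arguments. They were removed in Linux 4.15; on current kernels, read the arguments from pt_regs in a kprobe pre_handler instead.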

Example: Trace a Kernel Function with Kprobes

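Let’s write a Kprobe to trace calls to sys_open (the system call for opening files). The symbol name depends on the kernel: since Linux 4.17, x86_64 syscall entry points carry a prefix (e.g., __x64_sys_open), and most C libraries now implement open() via openat, so check /proc/kallsyms and adjust symbol_name if the probe fails to register or never fires.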

Kprobe Module Code:

#include <linux/kprobes.h>
#include <linux/module.h>
#include <linux/sched.h>

MODULE_LICENSE("GPL");

// Pre-handler: Runs before sys_open is called
static int pre_handler(struct kprobe *p, struct pt_regs *regs) {
    struct task_struct *task = current; // Current process
    printk(KERN_INFO "KPROBE: sys_open called by %s (PID: %d)\n", task->comm, task->pid);
    return 0;
}

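// Post-handler: Runs after the probed instruction (the first instruction of sys_open) has executed,
// not after the function returns; use a kretprobe to hook the actual return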
static void post_handler(struct kprobe *p, struct pt_regs *regs, unsigned long flags) {
    printk(KERN_INFO "KPROBE: sys_open returned\n");
}

// Define the kprobe (attach to sys_open)
static struct kprobe kp = {
    .symbol_name = "sys_open",  // Function to probe
    .pre_handler = pre_handler,
    .post_handler = post_handler,
};

static int __init kprobe_init(void) {
    int ret;
    ret = register_kprobe(&kp);
    if (ret < 0) {
        printk(KERN_ERR "Failed to register kprobe: %d\n", ret);
        return ret;
    }
    printk(KERN_INFO "Kprobe registered for sys_open\n");
    return 0;
}

static void __exit kprobe_exit(void) {
    unregister_kprobe(&kp);
    printk(KERN_INFO "Kprobe unregistered\n");
}

module_init(kprobe_init);
module_exit(kprobe_exit);

Build and Load:

make -C /lib/modules/$(uname -r)/build M=$(pwd) modules
sudo insmod kprobe_example.ko

Test It: Open a file (e.g., cat /etc/hosts), then check logs:

dmesg | grep "KPROBE"
# Output: KPROBE: sys_open called by cat (PID: 1234)
#         KPROBE: sys_open returned
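
Because register_kprobe fails when the symbol does not exist, it can save a reboot or two to check /proc/kallsyms first. The user-space sketch below tests whether one kallsyms line describes a given function; the helper name is hypothetical, and it assumes the "address type name [module]" line layout, treating types T and t as code symbols.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 and stores the address in *addr if `line`
 * (one line of /proc/kallsyms) describes `symbol` as a text (code) symbol,
 * 0 otherwise. */
static int kallsyms_match(const char *line, const char *symbol,
                          unsigned long long *addr)
{
    unsigned long long a;
    char type;
    char name[256];

    assert(line != NULL && symbol != NULL && addr != NULL);
    if (sscanf(line, "%llx %c %255s", &a, &type, name) != 3)
        return 0;
    if (type != 'T' && type != 't')   /* only code symbols can be probed */
        return 0;
    if (strcmp(name, symbol) != 0)
        return 0;
    *addr = a;
    return 1;
}
```

Running this over every line of /proc/kallsyms (as root, so that addresses are not masked) for the name placed in symbol_name shows whether the probe target is present on the running kernel.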

Pros/Cons:

  • Pros: Non-intrusive (no code modification), traces any kernel function, works on production systems (with caution).
  • Cons: Complex to write, risk of crashing the kernel if handlers have bugs, limited to function-level tracing.

5. Ftrace: Function Tracing and Profiling

Ftrace is a built-in kernel tracing framework designed to debug latency, function calls, and concurrency issues. It uses tracefs (mounted at /sys/kernel/tracing/) to expose tracing controls and output.

Key Ftrace Features:

  • Function Tracer: Logs all kernel function calls (with timestamps).
  • Function Graph Tracer: Visualizes function call graphs (parent/child relationships).
  • Event Tracer: Traces kernel events (e.g., scheduler, memory allocations, module loads).

Example: Trace Module Function Calls

Let’s trace the init and exit functions of our earlier mymodule.

Enable Function Tracing:

sudo mount -t tracefs nodev /sys/kernel/tracing  # Mount tracefs (if not mounted)
cd /sys/kernel/tracing

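# Run these as root (e.g., from a "sudo -i" shell)
# Enable function tracing and filter by module
echo function > current_tracer
echo ':mod:mymodule' > set_ftrace_filter  # Only trace functions in "mymodule" (a bare module name is not a valid filter)
echo 1 > tracing_on  # Start tracing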

Load the Module:

sudo insmod /path/to/mymodule.ko
sudo rmmod mymodule

View Tracing Output:

cat trace
# Output example:
# (the task column shows the process that made the call; caller names vary by kernel version)
# insmod-1234  [001] d... 12345.678901: mymodule_init <-do_one_initcall
# rmmod-1240   [001] d... 12350.123456: mymodule_exit <-sys_delete_module
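
Function tracer output is line-oriented, so it is easy to post-process. The sketch below parses one trace line into its fields in user space; the struct and function names are hypothetical, and it assumes the default "task-pid [cpu] flags timestamp: function <-caller" layout shown above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields of one function-tracer line (hypothetical struct for this sketch). */
struct ftrace_event {
    char comm[64];      /* task name */
    int pid;
    int cpu;
    double timestamp;   /* seconds */
    char func[128];     /* traced function */
    char caller[128];   /* parent function */
};

/* Returns 0 on success, -1 for header lines or lines that do not match. */
static int parse_ftrace_line(const char *line, struct ftrace_event *ev)
{
    const char *bracket, *dash = NULL, *p;
    size_t len;

    assert(line != NULL && ev != NULL);
    while (*line == ' ')
        line++;
    if (*line == '#' || *line == '\0')
        return -1;                       /* header or empty line */
    bracket = strstr(line, " [");        /* start of the CPU field */
    if (bracket == NULL)
        return -1;
    /* Task names may contain '-', so use the last '-' before the CPU field. */
    for (p = line; p < bracket; p++)
        if (*p == '-')
            dash = p;
    if (dash == NULL)
        return -1;
    len = (size_t)(dash - line);
    if (len == 0 || len >= sizeof ev->comm)
        return -1;
    memcpy(ev->comm, line, len);
    ev->comm[len] = '\0';
    if (sscanf(dash + 1, "%d", &ev->pid) != 1)
        return -1;
    if (sscanf(bracket, " [%d] %*s %lf: %127s <-%127s",
               &ev->cpu, &ev->timestamp, ev->func, ev->caller) != 4)
        return -1;
    return 0;
}
```

Subtracting the timestamps of two parsed events, for example the init and exit calls of mymodule, gives a rough measure of the time between them.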

Pros/Cons:

  • Pros: Lightweight, low overhead, excellent for profiling and call graphs.
  • Cons: Limited to tracing (can’t modify execution), requires CONFIG_FTRACE in the kernel (plus CONFIG_FUNCTION_TRACER for the function tracer).

6. QEMU + GDB: Virtualized Debugging

Testing kernel modules directly on physical hardware is risky (a panic crashes the system). Instead, use QEMU to run a virtual machine (VM) with a debuggable kernel, and debug the module remotely via GDB.

Setup Steps:

  1. Build a Debug Kernel: Compile a custom kernel with CONFIG_DEBUG_INFO and CONFIG_GDB_SCRIPTS enabled.

  2. Create a VM Image: Use debootstrap to create a minimal root filesystem for the VM.

  3. Run QEMU with Debugging: Start the VM with QEMU, enabling a GDB stub:

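    qemu-system-x86_64 -kernel /path/to/bzImage -drive file=rootfs.img,format=raw -append "root=/dev/sda rw console=ttyS0 nokaslr" -s -S -nographic
    # -s: Listen for GDB on port 1234
    # -S: Pause at startup (wait for GDB connection)
    # -append: Kernel command line; nokaslr keeps runtime addresses in line with the vmlinux symbols,
    #          console=ttyS0 sends kernel output to the terminal used by -nographic
    #          (root=/dev/sda assumes rootfs.img holds a bare filesystem on the first disk)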
  4. Connect GDB to the VM:

    gdb /path/to/vmlinux -ex "target remote localhost:1234"
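  5. Load the Module in the VM: Copy the module to the VM (via scp or shared folder) and load it with insmod. Use GDB to set breakpoints and debug as with KGDB. With CONFIG_GDB_SCRIPTS enabled, the lx-symbols command from the kernel’s GDB helper scripts can load symbols for loaded modules, instead of running add-symbol-file by hand.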

Pros/Cons:

  • Pros: Safe (no risk to host), easy to reproduce bugs, integrates with GDB.
  • Cons: Slow (VM overhead), requires disk space for VM images.

7. Post-Mortem Debugging with crash and Kdump

When the kernel panics, Kdump captures a vmcore (kernel memory dump), which can be analyzed later with the crash tool. This is critical for debugging hard-to-reproduce crashes.

Setup Kdump:

  1. Enable Kdump: On Debian/Ubuntu:

    sudo apt install kdump-tools
    sudo systemctl enable kdump-tools
  2. Configure Kdump: Edit /etc/default/grub to reserve memory for the crash kernel. Add crashkernel=128M to GRUB_CMDLINE_LINUX, then update GRUB:

    sudo update-grub
    sudo reboot
  3. Trigger a Panic: For testing, force a kernel panic (in a VM!):

    echo c | sudo tee /proc/sysrq-trigger  # Requires CONFIG_MAGIC_SYSRQ; the write must be done as root
  4. Analyze the vmcore: After reboot, the vmcore is saved to /var/crash/. Use crash to inspect it:

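    sudo crash /usr/lib/debug/boot/vmlinux-$(uname -r) /var/crash/202401011234/vmcore  # The debug vmlinux comes from the kernel debug-symbol package; its path varies by distribution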

Common crash Commands:

  • bt: Show the panic stack trace.
  • ps: List processes at the time of the crash.
  • mod: List loaded modules.
  • dis: Disassemble kernel code near the crash (GDB’s disassemble is also accepted).

Pros/Cons:

  • Pros: Debug crashes after they occur, no need to reproduce the bug.
  • Cons: Requires preconfiguration, vmcore files can be large (GBs).

Best Practices

  • Test in a VM: Never debug on production hardware—use QEMU or VirtualBox to isolate crashes.
  • Enable Debug Symbols: Always build modules with -g and enable CONFIG_DEBUG_INFO in the kernel.
  • Start Simple: Use printk/dynamic debug for initial diagnostics before moving to KGDB or Kprobes.
  • Document Bugs: Reproduce steps, logs, and stack traces to share with the community (e.g., LKML).
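  • Debug Concurrency Carefully: Log smp_processor_id() alongside your messages and call dump_stack() to see which CPU and call path reached a critical section; neither prevents a race, but both help narrow down where one occurs.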

Conclusion

Debugging Linux kernel modules requires a mix of tools and patience. From simple printk logs to advanced tracing with Kprobes or post-mortem analysis with crash, each method solves specific problems. Start with dynamic debug for quick diagnostics, use KGDB for complex logic bugs, and rely on Kprobes/Ftrace for tracing. Always test in a VM, and enable debug symbols to make the most of these tools.

With these approaches, you’ll be well-equipped to tackle even the trickiest kernel module bugs.

References