Table of Contents#
- What Is vmstat?
- Installing vmstat
- Basic Syntax of vmstat
- Decoding vmstat Output
- Common vmstat Usage Examples
- Advanced vmstat Options
- Best Practices for Using vmstat
- Troubleshooting with vmstat: Real-World Scenarios
- Tips for Effective Monitoring
- Conclusion
- References
1. What Is vmstat?#
vmstat is a command-line tool that reports virtual memory statistics for Linux systems. It aggregates data from the /proc filesystem (specifically /proc/stat, /proc/meminfo, and /proc/vmstat) to provide insights into:
- Process activity (running/blocked processes)
- Memory usage (free, buffers, cache)
- Swap space utilization
- Disk I/O (read/write rates)
- System activity (interrupts, context switches)
- CPU utilization (user, system, idle time)
Unlike tools like iostat (for disk I/O) or mpstat (for CPU), vmstat offers a holistic view of system performance—making it a great starting point for troubleshooting.
2. Installing vmstat#
vmstat is part of the procps-ng package, which is pre-installed on nearly all Linux distributions. To verify if it’s installed:

    vmstat --version

If missing, install it using your package manager:
- Debian/Ubuntu: `sudo apt update && sudo apt install procps`
- RHEL/CentOS/Rocky Linux: `sudo dnf install procps-ng`
- Fedora: `sudo dnf install procps-ng`
- Arch Linux: `sudo pacman -S procps-ng`
3. Basic Syntax of vmstat#
The basic syntax for vmstat is:

    vmstat [options] [interval] [count]

| Parameter | Description |
|---|---|
| `options` | Modify output (e.g., -a for active/inactive memory, -t for timestamps) |
| `interval` | Time (in seconds) between consecutive samples |
| `count` | Number of samples to collect (optional; if omitted while an interval is set, vmstat runs until interrupted) |
Example#
To collect 5 samples at 1-second intervals:

    vmstat 1 5

4. Decoding vmstat Output#
Let’s start with a sample output to explain each column:

    vmstat 1 2
    procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
    r b swpd free buff cache si so bi bo in cs us sy id wa st
    1 0 0 123456 78901 234567 0 0 0 0 123 456 5 3 92 0 0
    0 0 0 123400 78901 234580 0 0 0 10 150 500 6 4 90 0 0

Key notes:
- The first line is the average since system boot.
- Subsequent lines are per-interval measurements (1 second in this case).
- Columns are grouped into 6 categories: procs, memory, swap, io, system, and cpu.
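Because the first data row is a since-boot average, it is usually worth discarding before analysing a capture. The following is a minimal sketch, assuming the input comes from `vmstat -n` (which prints the two header lines only once); the function name `vmstat_intervals` is made up for illustration.

```shell
# Keep only per-interval rows from `vmstat -n <interval> [count]` output.
# Lines 1-2 are the header; line 3 is the since-boot average.
vmstat_intervals() {
    awk 'NR > 3'
}

# Usage (not run here): vmstat -n 1 5 | vmstat_intervals
```

Recent procps-ng releases also document a `-y` (`--no-first`) flag that omits the first report; check `man vmstat` on your system before relying on it.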
4.1 Procs (Processes)#
These columns track process state:
| Column | Description | Red Flag |
|---|---|---|
| `r` | Number of runnable processes (running or waiting for CPU time) | r > number of CPU cores |
| `b` | Number of processes in uninterruptible sleep (usually blocked on I/O) | b > 2 (consistently) |
Example#
If r = 5 on a 4-core system, at least one runnable process is waiting for CPU time. If that persists across samples, the CPU is a bottleneck.
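The comparison between `r` and the core count can be automated. This is a sketch rather than a polished monitor: it assumes `vmstat -n` output on stdin (two header lines, then data) and takes the core count as an argument; `check_run_queue` is a hypothetical helper name.

```shell
# Print a line for every sample whose run queue (column 1, r)
# exceeds the core count given as $1.
check_run_queue() {
    awk -v cores="$1" 'NR > 2 && $1 + 0 > cores + 0 {
        printf "r=%d exceeds %d cores\n", $1, cores
    }'
}

# Usage (not run here): vmstat -n 1 10 | check_run_queue "$(nproc)"
```

A single flagged sample is rarely meaningful; look for runs of consecutive hits.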
4.2 Memory#
Tracks memory usage (units: KiB by default):
| Column | Description |
|---|---|
| `swpd` | Amount of swap space in use |
| `free` | Unused physical memory |
| `buff` | Buffer cache: memory used for block device metadata (e.g., inodes) |
| `cache` | Page cache: memory used for file data (speeds up repeated reads/writes) |
Key Notes#
- Buffers vs. Cache: Buffers handle metadata; cache handles file content.
- Linux uses unused memory for cache/buffers to improve performance—this is normal.
4.3 Swap#
Tracks swap activity (units: KiB per second by default):
| Column | Description | Red Flag |
|---|---|---|
| `si` | Swap in: memory read back from swap (disk → RAM) | si > 0 (consistently) |
| `so` | Swap out: memory written to swap (RAM → disk) | so > 0 (consistently) |
Example#
If si and so both stay around 100 across many samples, the system is thrashing: it is constantly moving pages between RAM and swap, which is a critical issue.
4.4 IO (Input/Output)#
Tracks disk I/O activity (units: blocks per second; the current procps-ng man page documents these as 1 KiB blocks, while older kernels may report other block sizes):
| Column | Description |
|---|---|
| `bi` | Blocks in: Data read from disk (disk → RAM) |
| `bo` | Blocks out: Data written to disk (RAM → disk) |
Example#
bi = 200 → about 200 KiB/s read from disk (200 × 1,024 bytes = 204,800 bytes).
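The conversion is simple arithmetic, but the block size is worth making explicit because it has differed between systems. The helper below is an illustrative sketch (the name `blocks_to_kib` is invented here); it defaults to 1024-byte blocks, which is an assumption based on current procps-ng documentation, and accepts another size as a second argument.

```shell
# Convert a bi/bo reading (blocks per second) to KiB per second.
# $1 = blocks per second, $2 = block size in bytes (assumed 1024 if omitted).
blocks_to_kib() {
    blocks=$1
    block_bytes=${2:-1024}
    echo $(( blocks * block_bytes / 1024 ))
}

blocks_to_kib 200        # prints 200 (1 KiB blocks)
blocks_to_kib 200 512    # prints 100 (512-byte blocks)
```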
4.5 System#
Tracks low-level system activity (per second):
| Column | Description |
|---|---|
| `in` | Number of interrupts (hardware/software) |
| `cs` | Number of context switches (CPU switching between processes/threads) |
Key Notes#
- High cs (e.g., >10,000) can indicate CPU contention (too many processes competing for time).
4.6 CPU#
Tracks CPU utilization (percentages; sum to 100):
| Column | Description | Red Flag |
|---|---|---|
| `us` | Time spent running user-level code (e.g., nginx, python) | us > 70% (consistently) |
| `sy` | Time spent running kernel code (system calls, I/O handling) | sy > 30% (consistently) |
| `id` | Idle time (CPU doing nothing) | id < 20% (CPU bottleneck) |
| `wa` | I/O wait time (CPU idle while waiting for I/O to complete) | wa > 20% (disk I/O bottleneck) |
| `st` | Steal time (VMs only: time the hypervisor gave to other guests) | st > 10% (VM resource contention) |
Critical Example#
If wa = 30%, the CPU is spending 30% of its time waiting for disk I/O—investigate disk performance.
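A threshold like this is easy to check mechanically. The sketch below assumes the default column layout shown earlier, where `wa` is column 16, and `vmstat -n` output on stdin; `high_iowait` is a hypothetical name and the 20% default simply mirrors the red flag in the table above.

```shell
# Print samples whose I/O wait (column 16, wa) exceeds a threshold.
# $1 = threshold in percent (defaults to 20).
high_iowait() {
    awk -v limit="${1:-20}" 'NR > 2 && $16 + 0 > limit + 0 {
        printf "wa=%d%% (r=%d b=%d)\n", $16, $1, $2
    }'
}

# Usage (not run here): vmstat -n 1 60 | high_iowait 20
```

Printing `r` and `b` alongside `wa` helps tell a disk-bound system (high `b`) from one that is merely busy.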
5. Common vmstat Usage Examples#
Let’s explore practical ways to use vmstat.
5.1 Basic One-Time Snapshot#
Run vmstat without parameters for an average since boot:

    vmstat

5.2 Continuous Monitoring#
To monitor indefinitely (press Ctrl+C to stop):

    vmstat 1

5.3 Custom Intervals and Counts#
Collect 10 samples at 2-second intervals:

    vmstat 2 10

5.4 Filtering Output with grep/awk#
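Focus on CPU I/O wait (wa) and running processes (r). The -n flag prints the header only once, and NR > 2 skips its two lines so that only data rows are printed:

    vmstat -n 1 | awk 'NR > 2 {print "r=" $1, "wa=" $16}'

Output:

    r=1 wa=0
    r=0 wa=0
    r=2 wa=5
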
5.5 Adding Timestamps (-t)#
Include timestamps for logging:

    vmstat -t 1 5
    procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- -----timestamp-----
    r b swpd free buff cache si so bi bo in cs us sy id wa st UTC
    1 0 0 123456 78901 234567 0 0 0 0 123 456 5 3 92 0 0 2024-05-20 12:34:56
    0 0 0 123400 78901 234580 0 0 0 10 150 500 6 4 90 0 0 2024-05-20 12:34:57

6. Advanced vmstat Options#
vmstat has several advanced flags to drill into specific metrics.
6.1 Active/Inactive Memory (-a)#
Show active (recently used) and inactive (reclaimable) memory:

    vmstat -a 1 2
    procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
    r b swpd free inact active si so bi bo in cs us sy id wa st
    1 0 0 123456 789012 345678 0 0 0 0 123 456 5 3 92 0 0

| Column | Description |
|---|---|
| `inact` | Inactive memory (can be reclaimed by kernel if needed) |
| `active` | Active memory (recently used, so less likely to be reclaimed) |
6.2 Summary Statistics (-s)#
Print a detailed summary of memory, swap, IO, and CPU:

    vmstat -s
    3981748 K total memory
    567890 K used memory
    1234567 K active memory
    789012 K inactive memory
    1234567 K free memory
    78901 K buffer memory
    2345678 K swap cache
    2097152 K total swap
    0 K used swap
    2097152 K free swap
    123 non-nice user cpu ticks
    45 nice user cpu ticks
    67 system cpu ticks
    98765 idle cpu ticks
    10 io wait cpu ticks
    0 irq cpu ticks
    5 softirq cpu ticks
    0 stolen cpu ticks
    12345 pages paged in
    67890 pages paged out
    0 pages swapped in
    0 pages swapped out
    12345 interrupts
    67890 context switches

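Since `vmstat -s` prints one labelled value per line, it is convenient to script against. As a rough sketch, the function below derives the share of free memory from the "K total memory" and "K free memory" lines; it assumes those labels appear exactly as in the sample above, and `free_mem_percent` is an invented name.

```shell
# Report free memory as a percentage of total, from `vmstat -s` on stdin.
free_mem_percent() {
    awk '/K total memory/ { total = $1 }
         /K free memory/  { free = $1 }
         END { if (total > 0) printf "%.1f%% free\n", free * 100 / total }'
}

# Usage (not run here): vmstat -s | free_mem_percent
```

A low figure here is not alarming on its own, because Linux deliberately fills otherwise unused memory with cache and buffers.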
6.3 Disk Statistics (-d)#
Show per-disk I/O (merged operations, sectors, latency):

    vmstat -d
    disk- ------------reads------------ ------------writes----------- -----IO------
    total merged sectors ms total merged sectors ms cur sec
    sda 123 456 7890 123 456 789 12345 678 0 1
    sdb 10 20 300 10 20 30 400 20 0 0

| Column | Description |
|---|---|
| `merged` | Merged I/O operations (reduces disk overhead) |
| `sectors` | Number of 512-byte sectors read/written |
| `ms` | Milliseconds spent on I/O |
| `cur` | Current I/O operations in progress |
| `sec` | Seconds spent doing I/O |
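Dividing the read `ms` column by the read `total` column gives a rough average time per read since boot. The sketch below assumes the `vmstat -d` layout shown above (device name, then read total, merged, sectors, ms); `disk_read_latency` is a hypothetical helper.

```shell
# Average milliseconds per completed read for each disk,
# computed from `vmstat -d` output on stdin.
disk_read_latency() {
    awk '$2 ~ /^[0-9]+$/ && $2 + 0 > 0 {
        printf "%s: %.2f ms per read\n", $1, $5 / $2
    }'
}

# Usage (not run here): vmstat -d | disk_read_latency
```

Because these counters are cumulative, the result is a lifetime average; `iostat -x` is better suited to current latency.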
6.4 Partition Statistics (-p)#
Show per-partition I/O (e.g., /dev/sda1):

    vmstat -p /dev/sda1
    partition reads read sectors writes requested writes
    /dev/sda1 1234 567890 789 123456

| Column | Description |
|---|---|
| `reads` | Number of read operations |
| `read sectors` | Sectors read from the partition |
| `writes` | Number of write operations |
| `requested writes` | Write requests (even if merged) |
6.5 Slab Allocator Statistics (-m)#
Show kernel slab memory (used for small, frequent allocations):

    vmstat -m
    Cache Num Total Size Pages
    kmalloc-8192 123 4567 8192 1
    kmalloc-4096 56 2345 4096 1
    ext4_inode_cache 789 1234 1024 2

| Column | Description |
|---|---|
| `Num` | Active objects in the cache |
| `Total` | Total objects allocated |
| `Size` | Size (in bytes) of each object |
| `Pages` | Number of pages used by the cache |
Use Case#
If Total for a cache (e.g., ext4_inode_cache) keeps increasing, it may indicate a kernel memory leak.
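Spotting that kind of growth means comparing snapshots taken some time apart. Here is one possible sketch: it takes two files saved from `vmstat -m` and lists caches whose Total increased. The function name `slab_growth` and the snapshot file names are illustrative, and reading slab data usually requires root on current kernels.

```shell
# List caches whose Total (column 3) grew between two `vmstat -m` snapshots.
# $1 = older snapshot file, $2 = newer snapshot file.
slab_growth() {
    awk '$1 == "Cache" { next }                  # skip header lines
         NR == FNR { before[$1] = $3; next }     # first file: remember Total
         ($1 in before) && $3 + 0 > before[$1] + 0 {
             printf "%s: %d -> %d\n", $1, before[$1], $3
         }' "$1" "$2"
}

# Usage (not run here):
#   sudo vmstat -m > slab.old; sleep 600; sudo vmstat -m > slab.new
#   slab_growth slab.old slab.new
```

Growth between two snapshots is only a hint. A cache that grows under load and shrinks again later is behaving normally.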
6.6 Human-Readable Output#
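vmstat outputs memory in KiB by default. To convert to human-readable format (MiB/GiB), pipe through numfmt. Here --header=2 skips both header lines, --field=3-6 selects swpd, free, buff, and cache, and --from-unit=1024 tells numfmt that the input values are in KiB:

    vmstat -n 1 2 | numfmt --header=2 --field=3-6 --from-unit=1024 --to=iec

7. Best Practices for Using vmstat#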
To get the most out of vmstat, follow these rules:
7.1 Establish a Baseline#
Run vmstat during normal system operation to create a baseline. When issues arise, compare current metrics to the baseline to spot anomalies.
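One simple form of baseline is the average of the CPU columns over a saved capture. The following sketch assumes the default layout (us, sy, id, wa in columns 13 to 16) and skips header lines by requiring a numeric first field; `vmstat_baseline` is a made-up name.

```shell
# Average the us, sy, id and wa columns over vmstat output on stdin.
vmstat_baseline() {
    awk '$1 ~ /^[0-9]+$/ {
             us += $13; sy += $14; id += $15; wa += $16; n++
         }
         END {
             if (n > 0)
                 printf "us=%.1f sy=%.1f id=%.1f wa=%.1f (n=%d)\n",
                        us / n, sy / n, id / n, wa / n, n
         }'
}

# Usage (not run here): vmstat_baseline < vmstat.log
```

The first data row of each capture is the since-boot average, so it may be worth removing it before averaging short logs.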
7.2 Combine with Other Tools#
vmstat is powerful, but it’s not a silver bullet. Use it with:
- `iostat`: Deep dive into disk I/O
- `top`/`htop`: Identify resource-hungry processes
- `sar`: Historical performance data
- `free`: Detailed memory usage
7.3 Monitor Over Time#
One-off snapshots are rarely useful. Collect data over minutes/hours to capture trends (e.g., vmstat 1 3600 > vmstat.log for 1 hour of data).
7.4 Avoid Misinterpreting Swap#
Some swap usage is normal (Linux uses swap to free up cache for active processes). Only worry if si/so are consistently > 0 (swap thrashing).
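To judge whether swapping is "consistent", count how many samples in a capture show any swap traffic. This sketch assumes the default layout, with si and so in columns 7 and 8; `swap_activity` is a hypothetical helper.

```shell
# Count samples with non-zero si (column 7) or so (column 8).
swap_activity() {
    awk '$1 ~ /^[0-9]+$/ {
             n++
             if ($7 + 0 > 0 || $8 + 0 > 0) busy++
         }
         END {
             if (n > 0)
                 printf "%d of %d samples show swap activity\n", busy, n
         }'
}

# Usage (not run here): vmstat -n 1 300 | swap_activity
```

A handful of hits in a long capture is usually harmless; a majority of samples with swap traffic points to memory pressure.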
7.5 Understand Units#
Remember:
- Memory: KiB by default (use `-S M` to report in MiB; `-h` only prints help)
- I/O: Blocks (1 KiB on current systems)
- CPU: Percentages
8. Troubleshooting with vmstat: Real-World Scenarios#
Let’s solve common performance issues using vmstat.
8.1 High CPU Utilization#
Symptom: id < 20% (low idle time), us or sy high.
Example:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa st
5 0 0 102400 51200 204800 0 0 0 0 500 1000 75 15 10 0 0
Analysis: r = 5 (5 runnable processes) on a 4-core system → CPU bottleneck. us = 75% (user processes are consuming CPU).
Fix:
- Use `top` to find CPU-hungry processes (e.g., a misbehaving `python` script).
- Optimize the process or add CPU cores.
8.2 Memory Bottlenecks#
Symptom: free is low, si/so > 0 (swap in/out).
Example:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd 10Mi 50Mi 20Mi 50 50 100 100 300 600 20 10 50 20 0
Analysis: free = 10Mi (very low), si = 50, so = 50 (swap thrashing).
Fix:
- Use `free -h` to confirm memory usage.
- Stop memory-hungry processes (try `kill <PID>` first, and use `kill -9 <PID>` only if the process ignores it) or add RAM.
8.3 Disk I/O Wait (High wa)#
Symptom: wa > 20% (CPU waiting for disk).
Example:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd 100Mi 50Mi 200Mi 0 0 500 800 400 800 10 5 55 30 0
Analysis: wa = 30% (CPU waits 30% of the time for disk). bi = 500, bo = 800 (high I/O).
Fix:
- Use `iostat -x 1` to find the slow disk (look for high `%util`):

      iostat -x 1
      avg-cpu: %user %nice %system %iowait %steal %idle
               10.00 0.00 5.00 30.00 0.00 55.00
      Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
      sda 5.00 80.00 250.00 4000.00 0.00 10.00 0.00 11.11 2.00 4.00 0.30 50.00 50.00 1.00 85.00

- If `sda` has `%util = 85%`, check:
  - Is the filesystem full? (`df -h`)
  - Which processes are writing to it? (`lsof +D /path/to/mount`)
- Optimize: Add faster disks (SSD), reduce I/O (e.g., disable unnecessary logs).
8.4 Swap Thrashing#
Symptom: si/so consistently > 0 (swap in/out).
Example:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd 500Mi 20Mi 30Mi 100 100 200 200 500 1000 15 10 45 30 0
Analysis: si = 100, so = 100 → system is swapping pages in/out constantly. This kills performance.
Fix:
- Use `free -h` to confirm low memory:

      free -h
            total used free shared buff/cache available
      Mem:  3.8Gi 3.2Gi 50Mi 100Mi 500Mi 200Mi
      Swap: 2.0Gi 500Mi 1.5Gi

  `available = 200Mi` (very low). Kill memory-hungry processes or add RAM.
9. Tips for Effective Monitoring#
- Log with Timestamps: Use `vmstat -t 1 > vmstat.log` to add timestamps.
- Watch Real-Time: Run `vmstat 1` directly for live updates. Wrapping a bare `vmstat` in `watch -n 1` only redisplays the since-boot averages.
- Filter Columns: Use `awk` to focus on critical metrics (e.g., `vmstat -n 1 | awk 'NR > 2 {print $1, $16}'` for `r` and `wa`).
- Automate Alerts: Use tools like `Prometheus`/`Grafana` or `Nagios` to alert on anomalies (e.g., `wa > 20%`).
10. Conclusion#
vmstat is a must-know tool for any Linux system administrator or developer. It provides a quick, high-level view of system performance and is invaluable for troubleshooting bottlenecks.
Remember:
- Baseline first: Know what “normal” looks like.
- Combine tools: `vmstat` + `iostat` + `top` = unbeatable troubleshooting.
- Monitor over time: Trends matter more than one-off snapshots.
With practice, you’ll be able to use vmstat to diagnose and fix performance issues in minutes.
11. References#
- `vmstat` Man Page: man7.org/linux/man-pages/man8/vmstat.8.html
- Linux Performance by Brendan Gregg: brendangregg.com/linuxperf.html
- Procps-ng Documentation: gitlab.com/procps-ng/procps
- Sysstat Documentation: sysstat.github.io