This repository provides a Dockerfile to create a container image with pre-installed Linux crisis tools, as recommended in Brendan Gregg’s blog post “Linux Crisis Tools.” The image is designed for debugging performance issues in Kubernetes production environments, ensuring essential diagnostic tools are readily available without installation delays during outages.
When a performance issue causes an outage in a Kubernetes cluster, installing diagnostic tools on the fly can waste critical time. This Docker image pre-installs a comprehensive set of Linux crisis tools to enable rapid debugging of performance bottlenecks in Kubernetes production environments.
The Dockerfile installs the following packages, as recommended by Brendan Gregg, on an Ubuntu base image:
| Package | Provides | Notes |
|---|---|---|
| procps | ps(1), vmstat(8), uptime(1), top(1) | Basic stats |
| util-linux | dmesg(1), lsblk(1), lscpu(1) | System log, device info |
| sysstat | iostat(1), mpstat(1), pidstat(1), sar(1) | Device stats |
| iproute2 | ip(8), ss(8), nstat(8), tc(8) | Preferred net tools |
| numactl | numastat(8) | NUMA stats |
| tcpdump | tcpdump(8) | Network sniffer |
| linux-tools-common, linux-tools-$(uname -r) | perf(1), turbostat(8) | Profiler and PMU stats |
| bpfcc-tools | opensnoop(8), execsnoop(8), runqlat(8), biotop(8), biosnoop(8), biolatency(8), tcptop(8), tcplife(8), trace(8), argdist(8), funccount(8), profile(8), etc. | Canned eBPF tools |
| bpftrace | bpftrace(8), basic versions of opensnoop(8), execsnoop(8), runqlat(8), biosnoop(8), etc. | eBPF scripting |
| trace-cmd | trace-cmd(1) | Ftrace CLI |
| nicstat | nicstat(1) | Net device stats |
| ethtool | ethtool(8) | Net device info |
| tiptop | tiptop(1) | PMU/PMC top |
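
As a rough illustration of what installing these packages involves, the sketch below lists them and prints the matching `apt-get` command. It is not taken from this repository's Dockerfile, which may be organized differently. The `crisis_packages` helper and the `RUN` switch are hypothetical names introduced for this example. Note that `linux-tools-$(uname -r)` resolves against the running kernel, so the package may not exist in the Ubuntu archive when the host kernel is not an Ubuntu kernel.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the packages from the table, one per line.
crisis_packages() {
    printf '%s\n' \
        procps util-linux sysstat iproute2 numactl tcpdump \
        linux-tools-common "linux-tools-$(uname -r)" \
        bpfcc-tools bpftrace trace-cmd nicstat ethtool tiptop
}

# Dry run by default: print the command instead of executing it.
# RUN=1 is an assumption of this sketch; installing requires root.
if [ "${RUN:-0}" = "1" ]; then
    # Word splitting of the package list is intentional here.
    apt-get update && apt-get install -y --no-install-recommends $(crisis_packages)
else
    echo "apt-get install -y --no-install-recommends $(crisis_packages | tr '\n' ' ')"
fi
```

Running the script without `RUN=1` only prints the command, which makes it easy to check the resolved `linux-tools` package name before installing anything.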
> [!NOTE]
> Some tools (e.g., `bpfcc-tools`, `bpftrace`) require kernel headers and specific privileges to function fully. Ensure your Kubernetes environment grants the necessary permissions (e.g., the `SYS_ADMIN` capability or privileged mode) for eBPF and tracing tools.
- Kubernetes cluster for deploying the container.
- Familiarity with Kubernetes debugging workflows (e.g., `kubectl exec` for accessing containers).
- For eBPF tools (e.g., `bpfcc-tools`, `bpftrace`), ensure the Kubernetes node kernel supports BPF and the container has appropriate capabilities.
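
One way to check the last point from inside a running container is sketched below. It reads the effective capability mask from `/proc/self/status` and looks for a few paths that eBPF and tracing tools commonly rely on. The `has_cap` and `report` helpers are hypothetical names for this example, and a missing path does not necessarily mean a tool will fail (for instance, some tools can fall back to kernel headers when BTF is absent).

```shell
#!/usr/bin/env bash
# Bit number of CAP_SYS_ADMIN, as defined in <linux/capability.h>.
CAP_SYS_ADMIN=21

# has_cap MASK BIT: succeed if bit BIT is set in the hex capability MASK.
has_cap() {
    local mask=$1 bit=$2
    (( (0x$mask >> bit) & 1 ))
}

# report LABEL PATH: print whether PATH exists.
report() {
    if [ -e "$2" ]; then echo "$1: present"; else echo "$1: missing"; fi
}

# CapEff is the effective capability set of the current process, in hex.
cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status 2>/dev/null)
if has_cap "${cap_eff:-0}" "$CAP_SYS_ADMIN"; then
    echo "CAP_SYS_ADMIN: effective"
else
    echo "CAP_SYS_ADMIN: not effective"
fi

report "BTF type info (/sys/kernel/btf/vmlinux)" /sys/kernel/btf/vmlinux
report "BPF filesystem (/sys/fs/bpf)" /sys/fs/bpf
report "debugfs tracing (/sys/kernel/debug/tracing)" /sys/kernel/debug/tracing
```

In a privileged container with the host mounts in place, all four lines would be expected to report a positive result; anything else is a hint about which permission or mount to look at first.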
To use this image in a Kubernetes cluster for debugging:
- Create a pod manifest (for example, `crisis-tools-pod.yaml`) to run the image with appropriate permissions:

  ```yaml
  apiVersion: v1
  kind: Pod
  metadata:
    name: crisis-tools-pod
  spec:
    hostPID: true
    hostUsers: true
    securityContext:
      seccompProfile:
        type: Unconfined
    volumes:
      - name: sys-kernel-debug
        hostPath:
          path: /sys/kernel/debug
          type: Directory
      - name: sys-fs-bpf
        hostPath:
          path: /sys/fs/bpf
          type: Directory
    containers:
      - name: crisis-tools
        image: docker.io/basilcrow/crisis-tools:latest
        securityContext:
          privileged: true
        volumeMounts:
          - name: sys-kernel-debug
            mountPath: /sys/kernel/debug
          - name: sys-fs-bpf
            mountPath: /sys/fs/bpf
        command: ["sleep", "infinity"]
  ```

- Apply the pod manifest:

  ```console
  $ kubectl apply -f crisis-tools-pod.yaml
  ```

- Use `kubectl exec` to access the pod:

  ```console
  $ kubectl exec -it crisis-tools-pod -- bash
  ```

- Inside the pod, you can run diagnostic commands like:

  ```console
  # uptime
  # dmesg | tail
  # vmstat 1
  # mpstat -P ALL 1
  # pidstat 1
  # iostat -xz 1
  # free -m
  # sar -n DEV 1
  # sar -n TCP,ETCP 1
  # top
  # opensnoop.bt
  # tcpdump -i eth0
  # perf stat -a sleep 10
  # bpftrace -e 'tracepoint:raw_syscalls:sys_enter { @[comm] = count(); }'
  ```
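
Several of the commands above run until interrupted. During an incident it can help to capture a bounded sample of each into files for later comparison. The following is a minimal sketch of that idea, not part of this image: `run_capture` and `snapshot` are hypothetical helpers, and the sample count (5), the 15-second limit, and the output directory are arbitrary choices for illustration. It assumes `timeout` from coreutils is available.

```shell
#!/usr/bin/env bash
# run_capture DIR NAME CMD...: run CMD for at most 15 s and save its
# combined stdout and stderr to DIR/NAME.txt. Failures are recorded
# in the file rather than aborting the snapshot.
run_capture() {
    local dir=$1 name=$2
    shift 2
    timeout 15 "$@" >"$dir/$name.txt" 2>&1 || true
}

# snapshot [DIR]: collect a short sample from each non-interactive tool
# and print the directory holding the results.
snapshot() {
    local dir=${1:-/tmp/crisis-$(date +%Y%m%dT%H%M%S)}
    mkdir -p "$dir"
    run_capture "$dir" uptime  uptime
    run_capture "$dir" dmesg   sh -c 'dmesg | tail'
    run_capture "$dir" vmstat  vmstat 1 5
    run_capture "$dir" mpstat  mpstat -P ALL 1 5
    run_capture "$dir" pidstat pidstat 1 5
    run_capture "$dir" iostat  iostat -xz 1 5
    run_capture "$dir" free    free -m
    run_capture "$dir" sar-dev sar -n DEV 1 5
    run_capture "$dir" sar-tcp sar -n TCP,ETCP 1 5
    echo "$dir"
}
```

After sourcing the file inside the pod, calling `snapshot` writes one text file per tool. Interactive tools such as `top` and open-ended tracers are left out on purpose, since they are better run by hand.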
> [!WARNING]
> Running privileged containers or granting capabilities like `SYS_ADMIN` can pose security risks. Use this image only in controlled debugging scenarios and remove the pod after use.
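
To make it harder to forget the cleanup, the apply, exec, and delete steps can be wrapped in one function that always removes the pod when the shell session ends. This is a sketch under a few assumptions: the manifest is saved as `crisis-tools-pod.yaml`, the `debug_session` function and the `KUBECTL`, `POD`, and `MANIFEST` variables are hypothetical names for this example, and the 120-second readiness timeout is an arbitrary choice.

```shell
#!/usr/bin/env bash
# Overridable settings (names are assumptions of this sketch).
KUBECTL=${KUBECTL:-kubectl}
POD=${POD:-crisis-tools-pod}
MANIFEST=${MANIFEST:-crisis-tools-pod.yaml}

# debug_session: create the pod, wait until it is Ready, open a shell,
# then delete the pod whether or not the earlier steps succeeded.
debug_session() {
    "$KUBECTL" apply -f "$MANIFEST" &&
        "$KUBECTL" wait --for=condition=Ready "pod/$POD" --timeout=120s &&
        "$KUBECTL" exec -it "$POD" -- bash
    local status=$?
    "$KUBECTL" delete pod "$POD" --ignore-not-found
    return "$status"
}
```

Calling `debug_session` after sourcing the file opens the interactive shell; exiting that shell triggers the delete. Setting `KUBECTL=echo` first prints the commands without contacting a cluster, which is a cheap way to review what would run.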
If you believe additional crisis tools should be included or have improvements to the Dockerfile, please open an issue or submit a pull request. Ensure any suggested tools align with the goal of lightweight, essential diagnostics for Kubernetes production environments.
This repository is licensed under the Apache License, Version 2.0. See the LICENSE file for details.