Kubernetes Storage Incident Deep Dive (RKE2)
From Failure to Best Practices
Architecture Diagram (Storage Flow)
This diagram outlines the flow of ephemeral storage resources from the application pod down to the node disk, highlighting the key monitoring checkpoints.
```mermaid
flowchart TD
    subgraph NodeLayer["Node Layer"]
        App["Application Pod"]
        Ephemeral["Ephemeral Storage<br/>emptyDir / /tmp FS"]
        RunK3s["/run/k3s (runtime)<br/>tmpfs / mounts"]
        VarLib["/var/lib/rancher<br/>rke2/agent<br/>container layers & volumes"]
        NodeDisk["Node Disk VM"]
    end
    subgraph Monitoring["Monitoring Layer"]
        Prom["Prometheus + Grafana<br/>node_filesystem_usage<br/>container_fs_usage"]
    end
    App -->|Temp Data| Ephemeral
    Ephemeral -->|Runtime Data| RunK3s
    Ephemeral -->|Persistent Data| VarLib
    RunK3s -->|Saturates| NodeDisk
    VarLib -->|Stores to| NodeDisk
    Prom -.->|Monitors| NodeDisk
    classDef default fill:#f9fafb,stroke:#d1d5db,stroke-width:2px;
    classDef disk fill:#fee2e2,stroke:#ef4444,stroke-width:2px,color:#991b1b;
    classDef monitor fill:#eff6ff,stroke:#3b82f6,stroke-width:2px,color:#1e40af;
    class NodeDisk disk;
    class Prom monitor;
```
Introduction
In modern cloud-native environments, most engineers focus heavily on CPU and memory. However, during a recent production-like scenario, I encountered a critical issue that highlighted a less-discussed but equally important resource: Ephemeral Storage in Kubernetes.
This article walks through:
- How the issue happened
- How RKE2 manages storage internally
- Step-by-step troubleshooting methodology
- Immediate recovery actions
- Long-term best practices
Environment:
- Kubernetes: RKE2 (v1.31)
- OS: Ubuntu 22.04
- Runtime: containerd
The Incident
Pods started getting evicted with the following event log:
The node was low on resource: ephemeral-storage
At the same time, the cluster experienced cascading effects:
- Nodes became unstable
- Disk usage reached 100%
- Services degraded
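Evictions like this surface as cluster events; a quick way to confirm the scope (assuming kubectl access; pod and namespace names are placeholders):

```bash
# List eviction events across all namespaces
kubectl get events -A --field-selector reason=Evicted

# Full detail for one affected pod, including the eviction message
kubectl describe pod <pod-name> -n <namespace>
```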
Investigation
1. Disk Check
df -h
Result: Root disk fully saturated.
2. Identify Large Directories
sudo du -h / --max-depth=1 | sort -rh
Key findings:
- /var → very large
- /run → abnormally large
3. Deep Dive into Specific Paths
sudo du -h /var/lib/rancher --max-depth=2 | sort -rh
sudo du -h /run --max-depth=1 | sort -rh
Findings:
- /var/lib/rancher/rke2/agent → bloated container layers & volumes
- /run/k3s → temporary runtime data (unexpectedly huge)
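du points at directories, not workloads. One way to attribute ephemeral usage to individual pods is the kubelet's stats summary, queried through the API server (a sketch assuming jq is installed; the node name is hypothetical):

```bash
# Ephemeral-storage bytes per pod, largest first
NODE=worker-1   # hypothetical node name
kubectl get --raw "/api/v1/nodes/${NODE}/proxy/stats/summary" \
  | jq -r '.pods[] | "\(.["ephemeral-storage"].usedBytes // 0) \(.podRef.namespace)/\(.podRef.name)"' \
  | sort -rn | head
```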
Root Cause
The issue was primarily caused by:
- Applications generating temporary files (images, processing data, etc.)
- Files stored directly in ephemeral storage
- No cleanup mechanism in place
Application → Temp files → Node filesystem → No cleanup → Disk full → Pod eviction
How RKE2 Manages Storage
Understanding internal storage management is key to resolving capacity issues.
1. Persistent Node Storage
Located at /var/lib/rancher/rke2/. This path contains:
- Container images
- Writable layers (overlayfs)
- Volumes
- Cluster data (etcd)
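To see how this path breaks down on a live node, the runtime itself can report image filesystem usage (a quick sketch; paths follow the default RKE2 layout):

```bash
# Image filesystem usage as reported by the container runtime
sudo crictl imagefsinfo

# Per-directory breakdown of the containerd state
sudo du -sh /var/lib/rancher/rke2/agent/containerd/* 2>/dev/null | sort -rh | head
```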
2. Runtime Storage
Located at /run/k3s/. This manages:
- Temporary mounts
- Sockets
- Runtime data
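A quick way to inspect it (note that /run is tmpfs, i.e. RAM-backed, so saturating it pressures memory as well as disk):

```bash
# Mounts the runtime keeps under /run/k3s
findmnt -rn | grep /run/k3s | head

# Overall /run usage
df -h /run
```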
3. Logs (stdout)
Maintained under /var/log/containers/ and /var/log/pods/.
- All container stdout/stderr logs flow here
- Rotation is handled by the kubelet and container runtime; the RKE2 services' own logs go to journald
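Their footprint is easy to check directly (a quick sketch):

```bash
# Actual log files live under /var/log/pods;
# /var/log/containers only holds symlinks into it
sudo du -sh /var/log/pods
ls -l /var/log/containers | head
```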
Immediate Recovery Actions
sudo systemctl restart rke2-agent
Effect: Recreates the runtime state and releases stale temporary mounts under /run/k3s.
sudo crictl rmi --prune
Effect: Removes unused images filling up /var/lib/rancher.
sudo crictl pods --state NotReady -q | xargs -r sudo crictl rmp
Effect: Removes stopped pod sandboxes safely.
sudo journalctl --vacuum-size=500M
Effect: Reduces system logs footprint.
Troubleshooting Methodology (Reusable)
Step 1: Check overall disk usage
df -h
Step 2: Find the biggest directories
du -h / --max-depth=1 | sort -rh
Step 3: Drill down into persistent storage
du -h /var/lib/rancher --max-depth=2
Step 4: Check runtime storage
du -h /run
Step 5: Inspect containers and images
crictl ps -a and crictl images
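The five steps bundle naturally into a small triage script (a minimal sketch; the filename is arbitrary, and it assumes root on the affected node):

```bash
#!/usr/bin/env bash
# storage-triage.sh - quick ephemeral-storage triage on an RKE2 node
# (run as root; du errors on unreadable paths are silenced)

echo "== Step 1: Overall disk =="
df -h /

echo "== Step 2: Largest top-level directories =="
du -xh / --max-depth=1 2>/dev/null | sort -rh | head   # -x stays on the root filesystem

echo "== Step 3: Persistent (RKE2) storage =="
du -h /var/lib/rancher --max-depth=2 2>/dev/null | sort -rh | head

echo "== Step 4: Runtime storage =="
du -h /run --max-depth=1 2>/dev/null | sort -rh | head

echo "== Step 5: Containers and images =="
crictl ps -a
crictl images
```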
Best Practices (Production Ready)
1. Set Ephemeral Storage Limits
```yaml
resources:
  requests:
    ephemeral-storage: "200Mi"
  limits:
    ephemeral-storage: "1Gi"
```
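In a full manifest, the block sits under each container; a minimal sketch (pod and image names are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: image-processor               # hypothetical
spec:
  containers:
    - name: worker
      image: example.com/worker:latest   # hypothetical
      resources:
        requests:
          ephemeral-storage: "200Mi"
        limits:
          ephemeral-storage: "1Gi"
```

A container that exceeds its ephemeral-storage limit gets its own pod evicted, before it can drive the whole node into disk pressure.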
2. Clean Temporary Files (Application Level)
- Always delete temporary files immediately after processing (see the sketch below)
- Avoid uncontrolled /tmp usage in application code
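For shell-driven processing jobs, mktemp plus a trap guarantees cleanup even when the job fails (a minimal sketch, not tied to any particular workload):

```bash
# Create an isolated scratch directory and remove it on exit,
# whether the job succeeds, fails, or is interrupted
WORKDIR=$(mktemp -d)
trap 'rm -rf "$WORKDIR"' EXIT

# ... process files inside "$WORKDIR" instead of bare /tmp ...
```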
3. Use Controlled Volumes
```yaml
emptyDir:
  sizeLimit: 1Gi
```
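Wired into a pod spec, with the volume mounted over /tmp so temporary files land on a bounded volume (volume and container names are illustrative):

```yaml
spec:
  containers:
    - name: worker
      volumeMounts:
        - name: scratch
          mountPath: /tmp
  volumes:
    - name: scratch
      emptyDir:
        sizeLimit: 1Gi
```

If the volume grows past sizeLimit, the kubelet evicts that pod alone, which keeps one noisy workload from taking the node down.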
4. Monitoring (Critical)
Deploy standard tools like Prometheus & Grafana to track metrics:
- Node disk usage
- Container filesystem usage
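As a starting point, an alert on node_exporter filesystem metrics (a sketch in standard Prometheus rule format; the 10% threshold and names are arbitrary):

```yaml
groups:
  - name: node-storage
    rules:
      - alert: NodeDiskAlmostFull
        expr: |
          (node_filesystem_avail_bytes{mountpoint="/"}
            / node_filesystem_size_bytes{mountpoint="/"}) < 0.10
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Root filesystem on {{ $labels.instance }} is below 10% free"
```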
5. Log Management (stdout)
Configure journald limits inside /etc/systemd/journald.conf:
SystemMaxUse=500M
MaxFileSec=7day
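The limits take effect once journald restarts; the current footprint can be verified immediately:

```bash
sudo systemctl restart systemd-journald
journalctl --disk-usage
```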
6. Capacity Planning
- A minimum of 50 GB per node is recommended
- Consider a dedicated disk or partition for /var
Key Takeaways & Conclusion
- Kubernetes does NOT manage disk automatically out of the box.
- Ephemeral storage is frequently overlooked but ultimately critical for stability.
- AI & processing workloads can silently fill up entire disks if untracked.
- Without well-defined limits and active cleanup, complete cluster instability is inevitable.
Proper management requires application discipline (cleanup), Kubernetes configuration (limits), and infrastructure monitoring.