Persistent Volumes and Claims
Understand Kubernetes storage concepts. Work with Persistent Volumes, Persistent Volume Claims, and dynamic provisioning.
In the previous tutorial, we learned about Secrets for managing sensitive data. Now let's tackle a big one — storage.
Containers are ephemeral. When a Pod dies, everything inside it goes poof. Databases, uploaded files, application state — all gone. It's like working on a computer that wipes its hard drive every time you restart.
Yeah, that sounds terrifying.
It is! That's why Persistent Volumes exist. They provide storage that outlives Pods — your data survives even when containers crash and burn.
The Storage Problem
By default, container storage is:
- Ephemeral: Deleted when the container stops (bye-bye data!)
- Isolated: Not shared between containers (unless they're in the same Pod)
- Local: Tied to the node where the Pod runs
For stateful applications (databases, file uploads, caches), you need storage that:
- Survives Pod restarts (duh)
- Can be shared across Pods
- Isn't tied to a specific node
That's a lot of requirements!
It is, and that's exactly why Kubernetes has this whole PV/PVC system. Let's break it down.
Storage Concepts
Kubernetes separates storage into three parts:
┌─────────────────────┐         ┌─────────────────────┐
│  PersistentVolume   │         │         Pod         │
│        (PV)         │         │  ┌───────────────┐  │
│                     │◄────────│  │ volumeMounts  │  │
│   Actual storage    │         │  │    /data      │  │
│ (disk, NFS, cloud)  │         │  └───────────────┘  │
└─────────────────────┘         │          ▲          │
           ▲                    └──────────┼──────────┘
           │                               │
           │ binds                         │ references
           │                               │
┌─────────────────────┐                    │
│PersistentVolumeClaim│────────────────────┘
│        (PVC)        │
│                     │
│  "I need 10Gi of    │
│   ReadWriteOnce     │
│   storage"          │
└─────────────────────┘
- PersistentVolume (PV): A piece of actual storage in the cluster. Think of it as an available parking spot. Created by admins or dynamically provisioned.
- PersistentVolumeClaim (PVC): A request for storage by a user. "Hey, I need a parking spot that's at least X big."
- StorageClass: Defines types of storage available (fast SSD, slow HDD, network storage). Like choosing between economy and premium parking.
Create a Persistent Volume
Let's create our first PV. In Minikube, we'll use hostPath — storage from the host machine. It's great for learning, but in production, you'd use cloud storage (AWS EBS, GCP PD) or network storage (NFS, Ceph).
Create pv.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/my-pv
Key fields:
| Field | Description |
|---|---|
| capacity.storage | Size of the volume |
| accessModes | How the volume can be mounted |
| persistentVolumeReclaimPolicy | What happens when the PVC is deleted |
| hostPath.path | Path on the host (Minikube only) |
Apply:
kubectl apply -f pv.yaml
Check:
kubectl get pv
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      AGE
my-pv   1Gi        RWO            Retain           Available   10s
Status is Available — like a parking spot with a "VACANT" sign. Ready to be claimed!
Access Modes
Different storage types support different access patterns:
| Mode | Abbreviation | Description |
|---|---|---|
| ReadWriteOnce | RWO | One node can read/write. Like a private parking spot. |
| ReadOnlyMany | ROX | Many nodes can read. Like a public library book. |
| ReadWriteMany | RWX | Many nodes can read/write. Like a shared Google Doc. |
Not all storage types support all modes. hostPath only supports RWO. Network storage like NFS supports RWX. Choose wisely!
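For contrast with the hostPath PV above, here's a sketch of what an RWX-capable NFS-backed PV could look like. The server address and export path are placeholders — nothing in this tutorial actually sets up an NFS server:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany        # RWX is possible because NFS is network storage
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 10.0.0.5       # placeholder NFS server address
    path: /exports/data    # placeholder export path
```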
Create a Persistent Volume Claim
Now let's claim some storage. The PVC is your request — Kubernetes finds a matching PV and binds them together.
Create pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
Apply:
kubectl apply -f pvc.yaml
Check:
kubectl get pvc
NAME     STATUS   VOLUME   CAPACITY   ACCESS MODES   AGE
my-pvc   Bound    my-pv    1Gi        RWO            5s
Status is Bound — the PVC found a matching PV and they're connected! It's like a dating app for storage.
Wait, the PV has 1Gi but the PVC only asked for 500Mi. What happens?
The PVC gets the whole PV. Think of it like renting a 2-bedroom apartment when you only needed 1 — you get the whole thing. There's no partial allocation.
Check the PV again:
kubectl get pv
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            AGE
my-pv   1Gi        RWO            Retain           Bound    default/my-pvc   1m
Status changed to Bound with the claim name.
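A side note: binding normally works by matching size and access mode, and Kubernetes picks a suitable PV for you. If you need a claim wired to one specific PV, you can pin it explicitly with spec.volumeName. A sketch (pinned-pvc is a hypothetical name, my-pv is the PV created earlier):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pinned-pvc
spec:
  volumeName: my-pv        # bind to this exact PV, skipping the matching logic
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
```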
Use PVC in a Pod
Create app-pod.yaml, mounting the PVC as a volume:
apiVersion: v1
kind: Pod
metadata:
  name: app-with-storage
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc
Apply:
kubectl apply -f app-pod.yaml
Write data to the volume:
kubectl exec app-with-storage -- sh -c 'echo "Hello from persistent storage!" > /usr/share/nginx/html/index.html'
Verify:
kubectl exec app-with-storage -- cat /usr/share/nginx/html/index.html
Hello from persistent storage!
Test Persistence
Now for the moment of truth — let's prove the data actually survives!
Delete the Pod:
kubectl delete pod app-with-storage
Recreate it:
kubectl apply -f app-pod.yaml
Check the data:
kubectl exec app-with-storage -- cat /usr/share/nginx/html/index.html
Hello from persistent storage!
The data survived the Pod deletion! The Pod died and was reborn, but the data remained. That's the whole point of persistent storage.
Reclaim Policies
What happens to the PV when you delete its PVC? This is important — it can be the difference between "oops" and "OOPS."
| Policy | Behavior |
|---|---|
| Retain | PV remains with data intact. Must be manually cleaned up. |
| Delete | PV and underlying storage are deleted. |
| Recycle | Deprecated. Data is deleted, PV made available again. |
For production databases, always use Retain. You don't want an accidental PVC deletion to nuke your data. Trust me on this one.
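Dynamically provisioned PVs usually inherit Delete from their StorageClass, but you can flip an existing PV to Retain after the fact. A sketch, assuming the PV name from earlier in this tutorial:

```shell
# Change the reclaim policy of an existing PV to Retain
# (my-pv is the PV created earlier; any PV name works here)
kubectl patch pv my-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```

This only changes what happens on PVC deletion going forward; it doesn't touch any data.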
Dynamic Provisioning
Manually creating PVs doesn't scale. Imagine having to create a PV every time someone wants storage — it's like being a hotel receptionist who has to build each room before a guest checks in.
Dynamic provisioning creates PVs automatically when you create a PVC. Much better!
Storage Classes
StorageClasses define how to provision storage dynamically.
Check available classes:
kubectl get storageclass
In Minikube:
NAME                 PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE
standard (default)   k8s.io/minikube-hostpath   Delete          Immediate
Create PVC with StorageClass
Create dynamic-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard  # Use Minikube's default class
  resources:
    requests:
      storage: 1Gi
Apply:
kubectl apply -f dynamic-pvc.yaml
A PV is created automatically:
kubectl get pv,pvc
NAME                                 CAPACITY   ACCESS MODES   STATUS   CLAIM
persistentvolume/pvc-abc123-xyz789   1Gi        RWO            Bound    default/dynamic-pvc

NAME                                STATUS   VOLUME              CAPACITY   ACCESS MODES
persistentvolumeclaim/dynamic-pvc   Bound    pvc-abc123-xyz789   1Gi        RWO
No manual PV creation needed. Kubernetes handled everything. How cool is that?
Cloud StorageClasses
In cloud environments, you get storage classes for different disk types:
AWS EKS:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
GCP GKE:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: ssd
provisioner: pd.csi.storage.gke.io
parameters:
type: pd-ssd
Practical Example: MySQL with Persistent Storage
Let's do something real — deploy a MySQL database that doesn't lose its data when the Pod restarts. Because a database that forgets everything is just a fancy /dev/null.
mysql-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
mysql-deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: rootpassword  # Demo only: use a Secret in production (see previous tutorial)
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: mysql-data
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-data
          persistentVolumeClaim:
            claimName: mysql-pvc
Apply:
kubectl apply -f mysql-pvc.yaml
kubectl apply -f mysql-deployment.yaml
Now MySQL data persists across Pod restarts. Your data is safe. You can sleep at night.
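If you want to see it with your own eyes, you can rerun the persistence test at the database level. A sketch, assuming the mysql client shipped inside the mysql:8 image and the root password from the manifest above:

```shell
# Create a database, kill the Pod, and check the database is still there
kubectl exec deploy/mysql -- mysql -uroot -prootpassword -e 'CREATE DATABASE demo;'
kubectl delete pod -l app=mysql        # the Deployment spawns a replacement
kubectl wait --for=condition=ready pod -l app=mysql --timeout=120s
kubectl exec deploy/mysql -- mysql -uroot -prootpassword -e 'SHOW DATABASES;'
# demo should still be listed if the PVC did its job
```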
EmptyDir: Temporary Shared Storage
Sometimes you don't need persistent storage, just shared storage between containers in the same Pod. That's what emptyDir is for — it's like a whiteboard that gets erased when the Pod is deleted:
apiVersion: v1
kind: Pod
metadata:
  name: shared-storage
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "while true; do date >> /data/log.txt; sleep 5; done"]
      volumeMounts:
        - name: shared
          mountPath: /data
    - name: reader
      image: busybox
      command: ["sh", "-c", "tail -f /data/log.txt"]
      volumeMounts:
        - name: shared
          mountPath: /data
  volumes:
    - name: shared
      emptyDir: {}
Both containers share /data. When the Pod is deleted, the data disappears. It's temporary — don't store anything important here!
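emptyDir also has a couple of useful knobs. Here's a sketch of a RAM-backed variant with a size cap (scratch is a hypothetical volume name):

```yaml
volumes:
  - name: scratch
    emptyDir:
      medium: Memory     # tmpfs: backed by RAM, counts against the container's memory limit
      sizeLimit: 64Mi    # the Pod is evicted if usage grows past this
```

The Memory medium is handy for fast scratch space like caches; just remember it vanishes even faster than disk-backed emptyDir (a node reboot clears it too).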
Troubleshooting
PVC Stuck in Pending
This is probably the most common storage headache:
kubectl describe pvc my-pvc
Common causes:
- No matching PV available (wrong size or access mode)
- StorageClass doesn't exist (check the name!)
- Insufficient storage capacity
Pod Can't Mount Volume
Another classic:
kubectl describe pod <pod-name>
Common causes:
- PVC not bound yet (check PVC status first)
- Volume already mounted by another Pod (RWO mode — only one node at a time!)
- Node doesn't have access to storage
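In both cases, the Events section usually names the exact problem. A few commands for digging in, using the objects from this tutorial as examples:

```shell
# Filter cluster events down to the objects in question
kubectl get events --field-selector involvedObject.name=my-pvc
kubectl get events --field-selector involvedObject.name=app-with-storage
# See which node the Pod landed on (relevant for RWO conflicts)
kubectl get pod app-with-storage -o wide
```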
Clean Up
kubectl delete pod app-with-storage shared-storage 2>/dev/null
kubectl delete deployment mysql 2>/dev/null
kubectl delete pvc my-pvc dynamic-pvc mysql-pvc 2>/dev/null
kubectl delete pv my-pv 2>/dev/null
What's Next?
Nice — your apps can now persist data! But how do you make sure one greedy container doesn't eat up all the CPU and memory on a node, starving everyone else?
In the next tutorial, you'll learn about Resource Limits and Requests — setting boundaries so your containers play nice with each other. Let's go!