Persistent Volumes and Claims

Understand Kubernetes storage concepts. Work with Persistent Volumes, Persistent Volume Claims, and dynamic provisioning.

In the previous tutorial, we learned about Secrets for managing sensitive data. Now let's tackle a big one — storage.

Containers are ephemeral. When a Pod dies, everything inside it goes poof. Databases, uploaded files, application state — all gone. It's like working on a computer that wipes its hard drive every time you restart.

Yeah, that sounds terrifying.

It is! That's why Persistent Volumes exist. They provide storage that outlives Pods — your data survives even when containers crash and burn.

The Storage Problem

By default, container storage is:

  • Ephemeral: Deleted when the container stops (bye-bye data!)
  • Isolated: Not shared between containers (unless they're in the same Pod)
  • Local: Tied to the node where the Pod runs

For stateful applications (databases, file uploads, caches), you need storage that:

  • Survives Pod restarts (duh)
  • Can be shared across Pods
  • Isn't tied to a specific node

That's a lot of requirements!

It is, and that's exactly why Kubernetes has this whole PV/PVC system. Let's break it down.

Storage Concepts

Kubernetes separates storage into three parts:

┌─────────────────────┐     ┌─────────────────────┐
│  PersistentVolume   │     │       Pod           │
│  (PV)               │     │  ┌───────────────┐  │
│                     │◄────│  │VolumeMounts   │  │
│  Actual storage     │     │  │ /data         │  │
│  (disk, NFS, cloud) │     │  └───────────────┘  │
└─────────────────────┘     │         ▲           │
          ▲                 └─────────┼───────────┘
          │                           │
          │ binds                     │ references
          │                           │
┌─────────────────────┐               │
│PersistentVolumeClaim│───────────────┘
│  (PVC)              │
│                     │
│  "I need 10Gi of    │
│   ReadWriteOnce     │
│   storage"          │
└─────────────────────┘

  • PersistentVolume (PV): A piece of actual storage in the cluster. Think of it as an available parking spot. Created by admins or dynamically provisioned.
  • PersistentVolumeClaim (PVC): A request for storage by a user. "Hey, I need a parking spot that's at least X big."
  • StorageClass: Defines types of storage available (fast SSD, slow HDD, network storage). Like choosing between economy and premium parking.

Create a Persistent Volume

Let's create our first PV. In Minikube, we'll use hostPath — storage from the host machine. It's great for learning, but in production, you'd use cloud storage (AWS EBS, GCP PD) or network storage (NFS, Ceph).

Create pv.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/my-pv

Key fields:

  • capacity.storage: Size of the volume
  • accessModes: How the volume can be mounted
  • persistentVolumeReclaimPolicy: What happens when the PVC is deleted
  • hostPath.path: Path on the host (Minikube only)

Apply:

kubectl apply -f pv.yaml

Check:

kubectl get pv
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      AGE
my-pv   1Gi        RWO            Retain           Available   10s

Status is Available — like a parking spot with a "VACANT" sign. Ready to be claimed!

Access Modes

Different storage types support different access patterns:

  • ReadWriteOnce (RWO): One node can read/write. Like a private parking spot.
  • ReadOnlyMany (ROX): Many nodes can read. Like a public library book.
  • ReadWriteMany (RWX): Many nodes can read/write. Like a shared Google Doc.

Not all storage types support all modes. hostPath only supports RWO. Network storage like NFS supports RWX. Choose wisely!
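If you're ever unsure what a volume supports, you can query it directly. A quick check, assuming the my-pv volume created earlier still exists in your cluster:

```shell
# Print the access modes a PV was created with
kubectl get pv my-pv -o jsonpath='{.spec.accessModes}'
```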

Create a Persistent Volume Claim

Now let's claim some storage. The PVC is your request — Kubernetes finds a matching PV and binds them together.

Create pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi

Apply:

kubectl apply -f pvc.yaml

Check:

kubectl get pvc
NAME     STATUS   VOLUME   CAPACITY   ACCESS MODES   AGE
my-pvc   Bound    my-pv    1Gi        RWO            5s

Status is Bound — the PVC found a matching PV and they're connected! It's like a dating app for storage.

Wait, the PV has 1Gi but the PVC only asked for 500Mi. What happens?

The PVC gets the whole PV. Think of it like renting a 2-bedroom apartment when you only needed 1 — you get the whole thing. There's no partial allocation.

Check the PV again:

kubectl get pv
NAME    CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM            AGE
my-pv   1Gi        RWO            Retain           Bound    default/my-pvc   1m

Status changed to Bound with the claim name.

Use PVC in a Pod

Mount the PVC as a volume:

apiVersion: v1
kind: Pod
metadata:
  name: app-with-storage
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc

Apply:

kubectl apply -f app-pod.yaml

Write data to the volume:

kubectl exec app-with-storage -- sh -c 'echo "Hello from persistent storage!" > /usr/share/nginx/html/index.html'

Verify:

kubectl exec app-with-storage -- cat /usr/share/nginx/html/index.html
Hello from persistent storage!

Test Persistence

Now for the moment of truth — let's prove the data actually survives!

Delete the Pod:

kubectl delete pod app-with-storage

Recreate it:

kubectl apply -f app-pod.yaml

Check the data:

kubectl exec app-with-storage -- cat /usr/share/nginx/html/index.html
Hello from persistent storage!

The data survived the Pod deletion! The Pod died and was reborn, but the data remained. That's the whole point of persistent storage.

Reclaim Policies

What happens to the PV when you delete its PVC? This is important — it can be the difference between "oops" and "OOPS."

  • Retain: PV remains with data intact. Must be manually cleaned up.
  • Delete: PV and underlying storage are deleted.
  • Recycle: Deprecated. Data is deleted, PV made available again.

For production databases, always use Retain. You don't want an accidental PVC deletion to nuke your data. Trust me on this one.
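If a PV already exists with the wrong policy, you don't have to recreate it. A patch sketch, assuming the my-pv volume from earlier:

```shell
# Flip an existing PV's reclaim policy to Retain in place
kubectl patch pv my-pv \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```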

Dynamic Provisioning

Manually creating PVs doesn't scale. Imagine having to create a PV every time someone wants storage — it's like being a hotel receptionist who has to build each room before a guest checks in.

Dynamic provisioning creates PVs automatically when you create a PVC. Much better!

Storage Classes

StorageClasses define how to provision storage dynamically.

Check available classes:

kubectl get storageclass

In Minikube:

NAME                 PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE
standard (default)   k8s.io/minikube-hostpath   Delete          Immediate
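That "(default)" tag is just an annotation on the StorageClass. If you ever need to change which class is the default, a patch like this works (shown here against Minikube's standard class):

```shell
# Mark a StorageClass as the cluster default; PVCs that omit
# storageClassName will use it automatically
kubectl patch storageclass standard \
  -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
```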

Create PVC with StorageClass

Create dynamic-pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard  # Use Minikube's default class
  resources:
    requests:
      storage: 1Gi

Apply:

kubectl apply -f dynamic-pvc.yaml

A PV is created automatically:

kubectl get pv,pvc
NAME                                 CAPACITY   ACCESS MODES   STATUS   CLAIM
persistentvolume/pvc-abc123-xyz789   1Gi        RWO            Bound    default/dynamic-pvc

NAME                                STATUS   VOLUME              CAPACITY   ACCESS MODES
persistentvolumeclaim/dynamic-pvc   Bound    pvc-abc123-xyz789   1Gi        RWO

No manual PV creation needed. Kubernetes handled everything. How cool is that?

Cloud StorageClasses

In cloud environments, you get storage classes for different disk types:

AWS EKS:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "3000"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer

GCP GKE:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd
provisioner: pd.csi.storage.gke.io
parameters:
  type: pd-ssd
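A PVC opts into one of these classes by name. A minimal sketch using the fast-ssd class defined above (the fast-pvc name and 20Gi size are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd  # the AWS EBS class defined above
  resources:
    requests:
      storage: 20Gi
```

With volumeBindingMode: WaitForFirstConsumer, the disk isn't provisioned until a Pod actually uses the PVC, so it gets created in the same availability zone as that Pod.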

Practical Example: MySQL with Persistent Storage

Let's do something real — deploy a MySQL database that doesn't lose its data when the Pod restarts. Because a database that forgets everything is just a fancy /dev/null.

mysql-pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi

mysql-deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: rootpassword  # demo only; in production, use a Secret (previous tutorial)
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: mysql-data
          mountPath: /var/lib/mysql
      volumes:
      - name: mysql-data
        persistentVolumeClaim:
          claimName: mysql-pvc

Apply:

kubectl apply -f mysql-pvc.yaml
kubectl apply -f mysql-deployment.yaml

Now MySQL data persists across Pod restarts. Your data is safe. You can sleep at night.
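To convince yourself the database really survives, write something, delete the Pod, and look again. A sketch that assumes the rootpassword value from the Deployment above:

```shell
# Create a database, kill the Pod, then check the data is still there
kubectl exec deploy/mysql -- mysql -uroot -prootpassword -e "CREATE DATABASE demo;"
kubectl delete pod -l app=mysql
kubectl wait --for=condition=Ready pod -l app=mysql --timeout=120s
kubectl exec deploy/mysql -- mysql -uroot -prootpassword -e "SHOW DATABASES;"
# "demo" should still appear in the list
```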

EmptyDir: Temporary Shared Storage

Sometimes you don't need persistent storage, just shared storage between containers in the same Pod. That's what emptyDir is for — it's like a whiteboard that gets erased when the Pod is deleted:

apiVersion: v1
kind: Pod
metadata:
  name: shared-storage
spec:
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "while true; do date >> /data/log.txt; sleep 5; done"]
    volumeMounts:
    - name: shared
      mountPath: /data
  - name: reader
    image: busybox
    command: ["sh", "-c", "tail -f /data/log.txt"]
    volumeMounts:
    - name: shared
      mountPath: /data
  volumes:
  - name: shared
    emptyDir: {}

Both containers share /data. When the Pod is deleted, the data disappears. It's temporary — don't store anything important here!
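To watch the sharing happen, tail the reader container's logs; a new timestamp should appear every five seconds:

```shell
# Stream what the reader sees (Ctrl+C to stop)
kubectl logs shared-storage -c reader --follow
```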

Troubleshooting

PVC Stuck in Pending

This is probably the most common storage headache:

kubectl describe pvc my-pvc

Common causes:

  • No matching PV available (wrong size or access mode)
  • StorageClass doesn't exist (check the name!)
  • Insufficient storage capacity
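A quick way to run through those causes, assuming the my-pvc claim from this tutorial:

```shell
# What did the claim ask for?
kubectl get pvc my-pvc -o jsonpath='{.spec.resources.requests.storage} {.spec.accessModes}'
# Is there an Available PV big enough, with a matching access mode?
kubectl get pv
# Does the requested StorageClass actually exist?
kubectl get storageclass
```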

Pod Can't Mount Volume

Another classic:

kubectl describe pod <pod-name>

Common causes:

  • PVC not bound yet (check PVC status first)
  • Volume already mounted by another Pod (RWO mode — only one node at a time!)
  • Node doesn't have access to storage

Clean Up

kubectl delete pod app-with-storage shared-storage 2>/dev/null
kubectl delete deployment mysql 2>/dev/null
kubectl delete pvc my-pvc dynamic-pvc mysql-pvc 2>/dev/null
kubectl delete pv my-pv 2>/dev/null

What's Next?

Nice — your apps can now persist data! But how do you make sure one greedy container doesn't eat up all the CPU and memory on a node, starving everyone else?

In the next tutorial, you'll learn about Resource Limits and Requests — setting boundaries so your containers play nice with each other. Let's go!