Jobs and CronJobs
Run batch workloads and scheduled tasks in Kubernetes using Jobs and CronJobs.
In the previous tutorial, we set up health checks so Kubernetes can detect and recover from application failures. Now let's talk about a different kind of workload.
Not everything is a long-running service that runs 24/7. Sometimes you just need to run a task once and exit. Like a database migration. Or a batch of emails. Or that script you "just need to run real quick" at 3 AM.
Jobs handle one-time tasks. CronJobs handle scheduled tasks. Think of Jobs as "do this thing" and CronJobs as "do this thing every Tuesday at midnight."
Jobs
A Job creates one or more Pods and ensures they run to completion. Unlike Deployments (which keep Pods running forever like a clingy ex), Jobs are for finite tasks. Do the thing, confirm it worked, done.
Use cases:
- Database migrations
- Batch processing
- Report generation
- Data import/export
- One-time scripts (that you'll probably run "one time" twelve times)
Create a Simple Job
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-job
spec:
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command: ["echo", "Hello from Kubernetes Job!"]
      restartPolicy: Never
Key differences from Pods/Deployments:
- `restartPolicy` must be `Never` or `OnFailure` (not `Always` — that would defeat the purpose)
- No `replicas` — Jobs have different completion semantics
Apply it:
kubectl apply -f hello-job.yaml
Check status:
kubectl get jobs
NAME COMPLETIONS DURATION AGE
hello-job 1/1 5s 30s
COMPLETIONS shows 1/1 — the job completed successfully. One and done!
View Job Pods
Jobs create Pods with auto-generated names (they're not picky about naming):
kubectl get pods
NAME READY STATUS RESTARTS AGE
hello-job-abc12 0/1 Completed 0 1m
View the output:
kubectl logs hello-job-abc12
Hello from Kubernetes Job!
Job Completion and Parallelism
"What if I need to run the same task multiple times, like processing a queue?"
Glad you asked! You can control how Jobs execute:
apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-job
spec:
  completions: 5     # Run 5 times total
  parallelism: 2     # Run 2 at a time
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo Processing item $RANDOM; sleep 5"]
      restartPolicy: Never
| Field | Description |
|---|---|
| `completions` | Total successful completions needed |
| `parallelism` | Maximum Pods running simultaneously |
This runs 5 Pods total, 2 at a time. Like a factory with 2 machines processing 5 orders.
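If it helps to see those semantics outside of Kubernetes, here's a rough Python analogy (a plain worker pool, not anything the API actually runs): five tasks total, never more than two in flight at once.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Rough analogy for completions=5, parallelism=2:
# five tasks total, at most two running at once.
COMPLETIONS = 5
PARALLELISM = 2

active = 0
peak = 0
lock = threading.Lock()

def process(item):
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    # ... do the work for this item ...
    with lock:
        active -= 1
    return f"processed item {item}"

with ThreadPoolExecutor(max_workers=PARALLELISM) as pool:
    results = list(pool.map(process, range(COMPLETIONS)))

print(len(results))          # 5 completions
print(peak <= PARALLELISM)   # True: never more than 2 in flight
```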
Backoff and Retries
"What if my Job fails?"
Kubernetes doesn't give up easily. It retries with exponential backoff:
apiVersion: batch/v1
kind: Job
metadata:
  name: flaky-job
spec:
  backoffLimit: 4     # Retry up to 4 times
  template:
    spec:
      containers:
      - name: flaky
        image: busybox
        command: ["sh", "-c", "exit 1"]     # Always fails
      restartPolicy: Never
After 4 failures, the Job is marked as failed. Kubernetes tried its best. The backoff increases exponentially (10s, 20s, 40s...) because spamming retries is nobody's idea of a good time.
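That delay progression is easy to sketch. Note this is an approximation: the real controller also adds jitter and caps the delay at six minutes.

```python
# Approximate Job retry delays: start at 10s, double per failure,
# cap at 6 minutes. The real controller adds jitter on top of this.
def backoff_delays(backoff_limit, base=10, cap=360):
    return [min(base * 2 ** i, cap) for i in range(backoff_limit)]

print(backoff_delays(4))  # [10, 20, 40, 80]
```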
Active Deadline
Don't want a Job running forever? Set a maximum runtime:
spec:
  activeDeadlineSeconds: 300     # Kill after 5 minutes
If the Job runs longer than this, all Pods are terminated and the Job is marked failed. It's the "you have 5 minutes" timer.
TTL After Finished
Automatically clean up completed Jobs (because nobody wants zombie Jobs cluttering up the cluster):
spec:
  ttlSecondsAfterFinished: 3600     # Delete 1 hour after completion
Without this, completed Jobs stick around until manually deleted.
Practical Job Example: Database Migration
Here's a real-world example — the kind of Job you'll actually write:
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 600
  ttlSecondsAfterFinished: 86400
  template:
    spec:
      containers:
      - name: migrate
        image: myapp:latest
        command: ["./migrate.sh"]
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
      restartPolicy: OnFailure
Features:
- Retries up to 3 times on failure (because migrations are flaky sometimes)
- Fails if not completed in 10 minutes (if it takes longer than that, something is very wrong)
- Auto-deletes after 24 hours (no zombie Jobs)
- Uses secrets for database URL (because hardcoding passwords is a crime)
CronJobs
"Okay, but what if I need to run something on a schedule?"
CronJobs! They're like the Unix cron you know and (maybe) love, but in Kubernetes.
Create a CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello-cron
spec:
  schedule: "*/5 * * * *"     # Every 5 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command: ["echo", "Hello from CronJob!"]
          restartPolicy: OnFailure
Cron Schedule Format
If you've ever used cron, this is the same format. If you haven't... well, memorize this diagram and you'll be fine:
┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of the month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday = 0)
│ │ │ │ │
* * * * *
Common schedules:
| Schedule | Meaning |
|---|---|
| `*/15 * * * *` | Every 15 minutes |
| `0 * * * *` | Every hour |
| `0 0 * * *` | Every day at midnight |
| `0 0 * * 0` | Every Sunday at midnight |
| `0 0 1 * *` | First day of every month |
| `0 9 * * 1-5` | 9 AM on weekdays |
Pro tip: Use crontab.guru to validate your cron expressions. Because nobody gets these right on the first try.
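If you want intuition for what those fields expand to, here's a toy expansion of the three forms used in the table (`*`, `*/step`, and `a-b` ranges); it's an illustration, not a full cron parser.

```python
# Toy illustration of how a single cron field expands into concrete
# values. Handles only "*", "*/step", "a-b", and a bare number --
# real cron also supports lists like "1,15" and named months/days.
def expand_field(field, lo, hi):
    if field == "*":
        return list(range(lo, hi + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(lo, hi + 1, step))
    if "-" in field:
        a, b = map(int, field.split("-"))
        return list(range(a, b + 1))
    return [int(field)]

print(expand_field("*/15", 0, 59))  # [0, 15, 30, 45]
print(expand_field("1-5", 0, 6))    # [1, 2, 3, 4, 5]  (weekdays)
```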
Apply:
kubectl apply -f hello-cron.yaml
Check status:
kubectl get cronjobs
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
hello-cron */5 * * * * False 0 2m 10m
View Jobs created by the CronJob:
kubectl get jobs
NAME COMPLETIONS DURATION AGE
hello-cron-1699900200 1/1 3s 5m
hello-cron-1699900500 1/1 3s 20s
It's creating Jobs automatically on schedule. How cool is that?
CronJob Options
Here's a CronJob with all the bells and whistles:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-cronjob
spec:
  schedule: "0 2 * * *"              # 2 AM daily
  concurrencyPolicy: Forbid          # Don't run if previous is still running
  successfulJobsHistoryLimit: 3      # Keep last 3 successful Jobs
  failedJobsHistoryLimit: 1          # Keep last failed Job
  startingDeadlineSeconds: 200       # Must start within 200s of scheduled time
  suspend: false                     # Set true to pause the CronJob
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:latest
            command: ["./backup.sh"]
          restartPolicy: OnFailure
Concurrency Policies
"What happens if the previous Job is still running when the next one is scheduled?"
Great question! You control this with concurrency policies:
| Policy | Behavior |
|---|---|
| `Allow` | Default. Multiple Jobs can run concurrently |
| `Forbid` | Skip new Job if previous is still running |
| `Replace` | Cancel running Job and start new one |
Use Forbid for most tasks to prevent overlap. You don't want two backup Jobs fighting over the same database.
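The decision each policy makes at schedule time boils down to a couple of branches. A tiny sketch in plain Python, just to make the table concrete:

```python
# What the controller decides when a new run comes due, per policy.
def on_schedule_tick(policy, previous_still_running):
    if not previous_still_running:
        return "start"                       # nothing in flight: always start
    return {
        "Allow": "start",                    # run alongside the old Job
        "Forbid": "skip",                    # wait for the next scheduled time
        "Replace": "cancel-then-start",      # kill the old Job first
    }[policy]

print(on_schedule_tick("Forbid", previous_still_running=True))   # skip
print(on_schedule_tick("Replace", previous_still_running=True))  # cancel-then-start
```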
Suspend a CronJob
Need to temporarily pause scheduling? Maybe for maintenance?
kubectl patch cronjob hello-cron -p '{"spec":{"suspend":true}}'
Resume:
kubectl patch cronjob hello-cron -p '{"spec":{"suspend":false}}'
Manually Trigger a CronJob
"Can I run a CronJob right now without waiting for the schedule?"
Absolutely! Create a Job from a CronJob immediately:
kubectl create job --from=cronjob/hello-cron manual-hello
Practical CronJob Example: Database Backup
This is the kind of CronJob that will save your bacon one day. A nightly PostgreSQL backup:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 3 * * *"     # 3 AM daily
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 7
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      backoffLimit: 2
      activeDeadlineSeconds: 3600
      template:
        spec:
          containers:
          - name: backup
            image: postgres:15
            command:
            - /bin/sh
            - -c
            - |
              pg_dump -h $DB_HOST -U $DB_USER $DB_NAME | gzip > /backup/db-$(date +%Y%m%d).sql.gz
            env:
            - name: DB_HOST
              value: postgres-service
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: username
            - name: DB_NAME
              value: myapp
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password
            volumeMounts:
            - name: backup
              mountPath: /backup
          restartPolicy: OnFailure
          volumes:
          - name: backup
            persistentVolumeClaim:
              claimName: backup-pvc
That's a production-ready backup job. It runs every night at 3 AM, keeps 7 successful backups (one week's worth), retries twice if something goes wrong, and stores backups on a persistent volume. Your future self will love you for this.
Monitoring Jobs
Job Status
kubectl describe job <job-name>
Key sections:
- `Completions`: How many Pods completed successfully
- `Parallelism`: How many run at once
- `Events`: Start, completion, or failure events
Job Conditions
kubectl get job <job-name> -o jsonpath='{.status.conditions}'
Conditions:
- `Complete`: Job finished successfully
- `Failed`: Job failed (backoff limit reached or deadline exceeded)
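In a script, you'd typically check those conditions from the JSON status. A minimal sketch, where the `status` dict is a hand-written stand-in for real `kubectl get job -o json` output:

```python
# Hand-written stand-in for the .status field of `kubectl get job -o json`.
status = {
    "succeeded": 1,
    "conditions": [{"type": "Complete", "status": "True"}],
}

def job_finished_ok(status):
    """True if the Job has a Complete condition with status True."""
    return any(
        c.get("type") == "Complete" and c.get("status") == "True"
        for c in status.get("conditions", [])
    )

print(job_finished_ok(status))  # True
```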
CronJob Status
kubectl describe cronjob <cronjob-name>
See:
- Last schedule time
- Active Jobs
- History of Jobs
Troubleshooting
When Jobs go sideways, here's your debugging playbook.
Job Never Completes
Check Pod status and logs:
kubectl get pods -l job-name=<job-name>
kubectl logs <pod-name>
kubectl describe pod <pod-name>
Common issues:
- Command fails (check exit code — `exit 1` means something broke)
- Image pull errors (typo in the image name, we've all been there)
- Resource limits too low (Job gets OOM killed before finishing)
CronJob Doesn't Run
kubectl describe cronjob <cronjob-name>
Check:
- Is it suspended? (the pause button we showed earlier)
- Is the schedule correct? (cron syntax is tricky, use crontab.guru)
- Check `lastScheduleTime` — has it ever run?
Jobs Piling Up
Old Jobs not being cleaned up? Your cluster is collecting garbage:
kubectl get jobs
Solutions:
- Set `ttlSecondsAfterFinished`
- Set `successfulJobsHistoryLimit` and `failedJobsHistoryLimit`
- Manual cleanup: `kubectl delete jobs --field-selector status.successful=1`
Clean Up
kubectl delete job hello-job parallel-job flaky-job 2>/dev/null
kubectl delete cronjob hello-cron postgres-backup 2>/dev/null
What's Next?
Awesome work! You now know how to run both one-time tasks and scheduled workloads in Kubernetes. No more SSH-ing into servers to run cron jobs manually. Welcome to the future.
But what about exposing your applications to the outside world with proper HTTP routing? In the next tutorial, we'll dive into Ingress Controllers — the front door to your cluster that handles routing, SSL termination, and much more. Let's go!