Jobs and CronJobs

Run batch workloads and scheduled tasks in Kubernetes using Jobs and CronJobs.

In the previous tutorial, we set up health checks so Kubernetes can detect and recover from application failures. Now let's talk about a different kind of workload.

Not everything is a long-running service that runs 24/7. Sometimes you just need to run a task once and exit. Like a database migration. Or a batch of emails. Or that script you "just need to run real quick" at 3 AM.

Jobs handle one-time tasks. CronJobs handle scheduled tasks. Think of Jobs as "do this thing" and CronJobs as "do this thing every Tuesday at midnight."

Jobs

A Job creates one or more Pods and ensures they run to completion. Unlike Deployments (which keep Pods running forever like a clingy ex), Jobs are for finite tasks. Do the thing, confirm it worked, done.

Use cases:

  • Database migrations
  • Batch processing
  • Report generation
  • Data import/export
  • One-time scripts (that you'll probably run "one time" twelve times)

Create a Simple Job

apiVersion: batch/v1
kind: Job
metadata:
  name: hello-job
spec:
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command: ["echo", "Hello from Kubernetes Job!"]
      restartPolicy: Never

Key differences from Pods/Deployments:

  • restartPolicy must be Never or OnFailure (not Always — that would defeat the purpose)
  • No replicas — Jobs have different completion semantics

Apply it:

kubectl apply -f hello-job.yaml

Check status:

kubectl get jobs
NAME        COMPLETIONS   DURATION   AGE
hello-job   1/1           5s         30s

COMPLETIONS shows 1/1 — the job completed successfully. One and done!

View Job Pods

Jobs create Pods with auto-generated names (they're not picky about naming):

kubectl get pods
NAME              READY   STATUS      RESTARTS   AGE
hello-job-abc12   0/1     Completed   0          1m

View the output:

kubectl logs hello-job-abc12
Hello from Kubernetes Job!

Job Completion and Parallelism

"What if I need to run the same task multiple times, like processing a queue?"

Glad you asked! You can control how Jobs execute:

apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-job
spec:
  completions: 5      # Run 5 times total
  parallelism: 2      # Run 2 at a time
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo Processing item $RANDOM; sleep 5"]
      restartPolicy: Never

Field         Description
completions   Total successful completions needed
parallelism   Maximum Pods running simultaneously

This runs 5 Pods total, 2 at a time. Like a factory with 2 machines processing 5 orders.
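
In parallel-job every Pod runs the same command, so no Pod can tell which of the 5 items is its own. One way to give each Pod a distinct piece of work is an Indexed Job. The following is a minimal sketch, assuming a cluster on Kubernetes 1.24 or newer (where Indexed completion mode is stable); the name indexed-job is made up for illustration.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: indexed-job          # hypothetical name, not used elsewhere in this tutorial
spec:
  completions: 5             # indexes 0 through 4
  parallelism: 2             # still 2 Pods at a time
  completionMode: Indexed    # each Pod is assigned exactly one index
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        # Kubernetes sets JOB_COMPLETION_INDEX in each Pod of an Indexed Job
        command: ["sh", "-c", "echo Processing item $JOB_COMPLETION_INDEX; sleep 5"]
      restartPolicy: Never
```

A real worker would typically use the index to pick a shard, a file, or a row range instead of just echoing it.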

Backoff and Retries

"What if my Job fails?"

Kubernetes doesn't give up easily. It retries with exponential backoff:

apiVersion: batch/v1
kind: Job
metadata:
  name: flaky-job
spec:
  backoffLimit: 4     # Retry up to 4 times
  template:
    spec:
      containers:
      - name: flaky
        image: busybox
        command: ["sh", "-c", "exit 1"]  # Always fails
      restartPolicy: Never

Once the first attempt and up to 4 retries have all failed, the Job is marked as failed. Kubernetes tried its best. The delay between retries grows exponentially (10s, 20s, 40s...) and is capped at six minutes, because spamming retries is nobody's idea of a good time.
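
How a retry looks depends on restartPolicy. With Never, as in flaky-job, each retry is a brand-new Pod. With OnFailure, the container is restarted inside the same Pod, and those restarts count toward backoffLimit as well. The variant below is a sketch for comparing the two side by side; the name flaky-job-onfailure is an assumption made for this example.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: flaky-job-onfailure   # hypothetical name, not used elsewhere in this tutorial
spec:
  backoffLimit: 4
  template:
    spec:
      containers:
      - name: flaky
        image: busybox
        command: ["sh", "-c", "exit 1"]  # Always fails, same as flaky-job
      restartPolicy: OnFailure           # Restart the container in place instead of creating a new Pod
```

Listing Pods with kubectl get pods -l job-name=flaky-job should show several failed Pods, one per attempt, while the OnFailure variant should show a single Pod whose RESTARTS count climbs. That Pod is terminated once the backoff limit is reached, so Never is usually the easier choice while debugging because the failed Pods and their logs stay around.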

Active Deadline

Don't want a Job running forever? Set a maximum runtime:

spec:
  activeDeadlineSeconds: 300  # Kill after 5 minutes

If the Job runs longer than this, all Pods are terminated and the Job is marked failed. It's the "you have 5 minutes" timer.
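
The fragment above leaves out where the field lives: it belongs in the Job's spec, next to backoffLimit, not in the Pod template. Here is a complete sketch that makes the deadline easy to observe. The name slow-job and the deliberately short 10-second deadline are assumptions chosen for the demo.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: slow-job               # hypothetical name, not used elsewhere in this tutorial
spec:
  activeDeadlineSeconds: 10    # Short on purpose so the timeout is quick to see
  backoffLimit: 4              # The deadline takes precedence over remaining retries
  template:
    spec:
      containers:
      - name: slow
        image: busybox
        command: ["sh", "-c", "sleep 60"]  # Outlives the 10s deadline
      restartPolicy: Never
```

After roughly 10 seconds, kubectl describe job slow-job should report a Failed condition with the reason DeadlineExceeded, and no further retries are started even though backoffLimit has not been used up.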

TTL After Finished

Automatically clean up completed Jobs (because nobody wants zombie Jobs cluttering up the cluster):

spec:
  ttlSecondsAfterFinished: 3600  # Delete 1 hour after completion

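Without this, a finished Job and its Pods stick around until you delete them manually. (Jobs created by a CronJob are the exception: they get pruned by the CronJob's history limits, covered later.)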
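
As with the deadline, the field sits at the top level of the Job's spec. A small end-to-end sketch, assuming a cluster on Kubernetes 1.23 or newer (where the TTL mechanism is stable); the name ttl-demo and the 60-second TTL are made up so the cleanup is quick to watch.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: ttl-demo                 # hypothetical name, not used elsewhere in this tutorial
spec:
  ttlSecondsAfterFinished: 60    # Applies whether the Job ends as Complete or Failed
  template:
    spec:
      containers:
      - name: hello
        image: busybox
        command: ["echo", "I will be cleaned up in a minute"]
      restartPolicy: Never
```

About a minute after the Job finishes, kubectl get jobs should no longer list ttl-demo, and its Pod is removed along with it. Grab any logs you need before the TTL expires.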

Practical Job Example: Database Migration

Here's a real-world example — the kind of Job you'll actually write:

apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  backoffLimit: 3
  activeDeadlineSeconds: 600
  ttlSecondsAfterFinished: 86400
  template:
    spec:
      containers:
      - name: migrate
        image: myapp:latest
        command: ["./migrate.sh"]
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
      restartPolicy: OnFailure

Features:

  • Retries up to 3 times on failure (because migrations are flaky sometimes)
  • Fails if not completed in 10 minutes (if it takes longer than that, something is very wrong)
  • Auto-deletes after 24 hours (no zombie Jobs)
  • Uses secrets for database URL (because hardcoding passwords is a crime)

CronJobs

"Okay, but what if I need to run something on a schedule?"

CronJobs! They're like the Unix cron you know and (maybe) love, but in Kubernetes.

Create a CronJob

apiVersion: batch/v1
kind: CronJob
metadata:
  name: hello-cron
spec:
  schedule: "*/5 * * * *"  # Every 5 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            command: ["echo", "Hello from CronJob!"]
          restartPolicy: OnFailure

Cron Schedule Format

If you've ever used cron, this is the same format. If you haven't... well, memorize this diagram and you'll be fine:

┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of the month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday = 0)
│ │ │ │ │
* * * * *

Common schedules:

Schedule        Meaning
*/15 * * * *    Every 15 minutes
0 * * * *       Every hour
0 0 * * *       Every day at midnight
0 0 * * 0       Every Sunday at midnight
0 0 1 * *       First day of every month
0 9 * * 1-5     9 AM on weekdays

Pro tip: Use crontab.guru to validate your cron expressions. Because nobody gets these right on the first try.

Apply:

kubectl apply -f hello-cron.yaml

Check status:

kubectl get cronjobs
NAME         SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
hello-cron   */5 * * * *   False     0        2m              10m

View Jobs created by the CronJob:

kubectl get jobs
NAME                    COMPLETIONS   DURATION   AGE
hello-cron-1699900200   1/1           3s         5m
hello-cron-1699900500   1/1           3s         <invalid>

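It's creating Jobs automatically on schedule. How cool is that? (If you see an AGE of <invalid>, kubectl computed a negative age, which usually means your machine's clock is slightly behind the cluster's.)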

CronJob Options

Here's a CronJob with all the bells and whistles:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-cronjob
spec:
  schedule: "0 2 * * *"              # 2 AM daily
  concurrencyPolicy: Forbid          # Don't run if previous is still running
  successfulJobsHistoryLimit: 3      # Keep last 3 successful Jobs
  failedJobsHistoryLimit: 1          # Keep last failed Job
  startingDeadlineSeconds: 200       # Must start within 200s of scheduled time
  suspend: false                     # Set true to pause the CronJob
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:latest
            command: ["./backup.sh"]
          restartPolicy: OnFailure
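
One option the manifest above does not set is the time zone. Without it, the schedule is interpreted in the local time zone of the kube-controller-manager, which on many clusters is UTC, so "2 AM daily" may not be 2 AM where you are. The sketch below pins the schedule to a named zone. It assumes Kubernetes 1.27 or newer (where spec.timeZone is stable); the name backup-cronjob-tz and the zone are example values.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-cronjob-tz      # hypothetical name, not used elsewhere in this tutorial
spec:
  schedule: "0 2 * * *"        # 2 AM daily...
  timeZone: "Europe/Berlin"    # ...in this IANA time zone (example value)
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: backup-tool:latest
            command: ["./backup.sh"]
          restartPolicy: OnFailure
```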

Concurrency Policies

"What happens if the previous Job is still running when the next one is scheduled?"

Great question! You control this with concurrency policies:

Policy    Behavior
Allow     Default. Multiple Jobs can run concurrently
Forbid    Skip new Job if previous is still running
Replace   Cancel running Job and start new one

Use Forbid for most tasks to prevent overlap. You don't want two backup Jobs fighting over the same database.
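
Replace fits tasks where only the newest run matters, such as refreshing a cache. The sketch below is built to make the overlap visible: the task sleeps for 90 seconds but is scheduled every minute, so each new run should cancel the previous one. The name cache-refresh and the command are assumptions for illustration.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cache-refresh          # hypothetical name, not used elsewhere in this tutorial
spec:
  schedule: "*/1 * * * *"      # Every minute
  concurrencyPolicy: Replace   # A new run cancels the one still in progress
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: refresh
            image: busybox
            # Deliberately longer than the schedule interval
            command: ["sh", "-c", "echo Refreshing cache; sleep 90"]
          restartPolicy: Never
```

Watching kubectl get jobs for a few minutes should show at most one active Job at a time, with older ones deleted before they finish. Switching the policy to Forbid instead should leave the running Job alone and skip the overlapping runs.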

Suspend a CronJob

Need to temporarily pause scheduling? Maybe for maintenance?

kubectl patch cronjob hello-cron -p '{"spec":{"suspend":true}}'

Resume:

kubectl patch cronjob hello-cron -p '{"spec":{"suspend":false}}'

Manually Trigger a CronJob

"Can I run a CronJob right now without waiting for the schedule?"

Absolutely! Create a Job from a CronJob immediately:

kubectl create job --from=cronjob/hello-cron manual-hello

Practical CronJob Example: Database Backup

This is the kind of CronJob that will save your bacon one day. A nightly PostgreSQL backup:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
spec:
  schedule: "0 3 * * *"  # 3 AM daily
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 7
  failedJobsHistoryLimit: 3
  jobTemplate:
    spec:
      backoffLimit: 2
      activeDeadlineSeconds: 3600
      template:
        spec:
          containers:
          - name: backup
            image: postgres:15
            command:
            - /bin/bash
            - -c
            - |
              # Without pipefail, a failed pg_dump is masked by gzip's exit code
              set -euo pipefail
              pg_dump -h "$DB_HOST" -U "$DB_USER" "$DB_NAME" | gzip > "/backup/db-$(date +%Y%m%d).sql.gz"
            env:
            - name: DB_HOST
              value: postgres-service
            - name: DB_USER
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: username
            - name: DB_NAME
              value: myapp
            - name: PGPASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password
            volumeMounts:
            - name: backup
              mountPath: /backup
          restartPolicy: OnFailure
          volumes:
          - name: backup
            persistentVolumeClaim:
              claimName: backup-pvc

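That's a solid starting point for a backup job. It runs every night at 3 AM, keeps the last 7 successful Job records around for inspection, retries twice if something goes wrong, and stores backups on a persistent volume. One caveat: the history limit only prunes Job objects, so the dump files on the volume pile up until you remove old ones yourself. Your future self will love you for this.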
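
The CronJob refers to a PersistentVolumeClaim called backup-pvc and a Secret called postgres-credentials, and both must exist in the same namespace before the first run. Here is a minimal sketch of the two. The 10Gi size, the access mode, and the credential values are placeholders, and the claim assumes your cluster has a default StorageClass.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-pvc
spec:
  accessModes:
  - ReadWriteOnce          # Assumption: one backup Pod mounts the volume at a time
  resources:
    requests:
      storage: 10Gi        # Placeholder size, adjust to your database
---
apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials
type: Opaque
stringData:                # Plain-text input, stored base64-encoded by the API server
  username: myapp_user     # Placeholder value
  password: change-me      # Placeholder value, never commit real passwords
```

If either object is missing, the backup Pod should get stuck before it starts, and kubectl describe pod on it will point at the missing claim or Secret.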

Monitoring Jobs

Job Status

kubectl describe job <job-name>

Key sections:

  • Completions: How many Pods completed successfully
  • Parallelism: How many run at once
  • Events: Start, completion, or failure events

Job Conditions

kubectl get job <job-name> -o jsonpath='{.status.conditions}'

Conditions:

  • Complete: Job finished successfully
  • Failed: Job failed (backoff limit reached or deadline exceeded)

CronJob Status

kubectl describe cronjob <cronjob-name>

See:

  • Last schedule time
  • Active Jobs
  • History of Jobs

Troubleshooting

When Jobs go sideways, here's your debugging playbook.

Job Never Completes

Check Pod status and logs:

kubectl get pods -l job-name=<job-name>
kubectl logs <pod-name>
kubectl describe pod <pod-name>

Common issues:

  • Command fails (check the exit code: anything non-zero means something broke)
  • Image pull errors (typo in the image name, we've all been there)
  • Resource limits too low (Job gets OOM killed before finishing)
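
For the last item, the fix is usually to give the container a realistic memory limit. This sketch shows where requests and limits go in a Job's Pod template. The name memory-demo and the numbers are assumptions, so size them from what your task actually uses.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: memory-demo          # hypothetical name, not used elsewhere in this tutorial
spec:
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo working; sleep 5"]
        resources:
          requests:
            memory: "128Mi"  # What the scheduler reserves for the Pod
            cpu: "100m"
          limits:
            memory: "256Mi"  # Going above this gets the container OOM killed
      restartPolicy: Never
```

An OOM-killed container shows up in kubectl describe pod with the reason OOMKilled and exit code 137.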

CronJob Doesn't Run

kubectl describe cronjob <cronjob-name>

Check:

  • Is it suspended? (the pause button we showed earlier)
  • Is the schedule correct? (cron syntax is tricky, use crontab.guru)
  • Check lastScheduleTime — has it ever run?

Jobs Piling Up

Old Jobs not being cleaned up? Your cluster is collecting garbage:

kubectl get jobs

Solutions:

  • Set ttlSecondsAfterFinished
  • Set successfulJobsHistoryLimit and failedJobsHistoryLimit
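  • Manual cleanup: kubectl delete jobs --field-selector status.successful=1 (this only matches Jobs with exactly one successful Pod, so multi-completion Jobs like parallel-job are left alone)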

Clean Up

kubectl delete job hello-job parallel-job flaky-job db-migrate manual-hello --ignore-not-found
kubectl delete cronjob hello-cron backup-cronjob postgres-backup --ignore-not-found

What's Next?

Awesome work! You now know how to run both one-time tasks and scheduled workloads in Kubernetes. No more SSH-ing into servers to run cron jobs manually. Welcome to the future.

But what about exposing your applications to the outside world with proper HTTP routing? In the next tutorial, we'll dive into Ingress Controllers — the front door to your cluster that handles routing, SSL termination, and much more. Let's go!