
Cron Jobs in Kubernetes - connect to existing Pod, execute script

How does a Kubernetes CronJob work?

When a CronJob resource is created, what Kubernetes actually does is register a schedule. Every 10 seconds the CronJob controller checks whether any schedules are due. When the scheduled time arrives, a new Job resource is created to handle the task for that specific run.
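For example, a minimal CronJob looks something like this (the name, schedule, image, and command are placeholders); each time the schedule fires, the controller creates a new Job from the jobTemplate:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: example-cron              # placeholder name
spec:
  schedule: "*/5 * * * *"         # every five minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: example
              image: busybox:1.36           # placeholder image
              command: ["sh", "-c", "date; echo running scheduled task"]
          restartPolicy: OnFailure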


As far as I'm aware there is no "official" way to do this the way you want, and that is I believe by design. Pods are supposed to be ephemeral and horizontally scalable, and Jobs are designed to exit. Having a cron job "attach" to an existing pod doesn't fit that model. The scheduler would have no idea whether the job completed.

Instead, a Job can bring up an instance of your application specifically for running the Job and then take it down once the Job is complete. To do this you can use the same image for the Job as for your Deployment, but use a different "Entrypoint" by setting command:.

If the job needs access to data created by your application, that data will need to be persisted outside the application/Pod. You could do this a few ways, but the obvious ones are a database or a persistent volume. For example, using a database would look something like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: APP
spec:
  selector:
    matchLabels:
      app: THAT
  template:
    metadata:
      labels:
        name: THIS
        app: THAT
    spec:
      containers:
        - image: APP:IMAGE
          name: APP
          command:
          - app-start
          env:
            - name: DB_HOST
              value: "127.0.0.1"
            - name: DB_DATABASE
              value: "app_db"

And a Job that connects to the same database, but with a different "Entrypoint":

apiVersion: batch/v1
kind: Job
metadata:
  name: APP-JOB
spec:
  template:
    metadata:
      name: APP-JOB
      labels:
        app: THAT
    spec:
      containers:
      - image: APP:IMAGE
        name: APP-JOB
        command:
        - app-job
        env:
          - name: DB_HOST
            value: "127.0.0.1"
          - name: DB_DATABASE
            value: "app_db"

Or the persistent volume approach would look something like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: APP
spec:
  selector:
    matchLabels:
      app: THAT
  template:
    metadata:
      labels:
        name: THIS
        app: THAT
    spec:
      containers:
        - image: APP:IMAGE
          name: APP
          command:
          - app-start
          volumeMounts:
          - mountPath: "/var/www/html"
            name: APP-VOLUME
      volumes:
        - name:  APP-VOLUME
          persistentVolumeClaim:
            claimName: APP-CLAIM

---

apiVersion: v1
kind: PersistentVolume
metadata:
  name: APP-VOLUME
  labels:
    service: app   # matched by the claim's selector below
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: NFS-SERVER   # placeholder; an NFS volume also needs a server address
    path: /app

---

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: APP-CLAIM
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  selector:
    matchLabels:
      service: app

With a job like this, attaching to the same volume:

apiVersion: batch/v1
kind: Job
metadata:
  name: APP-JOB
spec:
  template:
    metadata:
      name: APP-JOB
      labels:
        app: THAT
    spec:
      containers:
      - image: APP:IMAGE
        name: APP-JOB
        command:
        - app-job
        volumeMounts:
        - mountPath: "/var/www/html"
          name: APP-VOLUME
      volumes:
        - name: APP-VOLUME
          persistentVolumeClaim:
            claimName: APP-CLAIM

Create a scheduled pod that uses the Kubernetes API to run the command you want on the target pods, via the exec function. The pod image should contain the client libraries to access the API; many of these are available, or you can build your own.

For example, here is a solution using the Python client that execs into each ZooKeeper pod and runs a database maintenance command:

#!/usr/bin/env python
# Exec a maintenance command in every matching ZooKeeper pod, from inside the cluster.

from kubernetes import config
from kubernetes.client import Configuration
from kubernetes.client.apis import core_v1_api
from kubernetes.stream import stream
import urllib3

# Use the service-account credentials mounted into this pod.
config.load_incluster_config()

configuration = Configuration()
configuration.verify_ssl = False
configuration.assert_hostname = False
urllib3.disable_warnings()
Configuration.set_default(configuration)

api = core_v1_api.CoreV1Api()
label_selector = 'app=zk,tier=backend'
namespace = 'default'

# Find all ZooKeeper pods by label.
resp = api.list_namespaced_pod(namespace=namespace,
                               label_selector=label_selector)

for x in resp.items:
    name = x.metadata.name

    # Confirm the pod is still present before exec'ing into it.
    api.read_namespaced_pod(name=name, namespace=namespace)

    exec_command = [
        '/bin/sh',
        '-c',
        '/opt/zookeeper/bin/zkCleanup.sh -n 10'
    ]

    # Run the cleanup script in the pod and capture its output.
    resp = stream(api.connect_get_namespaced_pod_exec, name, namespace,
                  command=exec_command,
                  stderr=True, stdin=False,
                  stdout=True, tty=False)

    print("============================ Cleanup %s: ============================\n%s\n"
          % (name, resp if resp else "<no output>"))

and the associated Dockerfile:

FROM ubuntu:18.04

# Copy the maintenance script into the image.
ADD ./cleanupZk.py /

# Install the Kubernetes Python client and make the script executable.
RUN apt-get update \
  && apt-get install -y python-pip \
  && pip install kubernetes \
  && chmod +x /cleanupZk.py

CMD /cleanupZk.py

Note that if you have an RBAC-enabled cluster, you may need to create a service account and appropriate roles to make this API call possible. A Role such as the following is sufficient to list pods and run exec, which is what the example script above requires:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-list-exec
  namespace: default
rules:
  - apiGroups: [""] # "" indicates the core API group
    resources: ["pods"]
    verbs: ["get", "list"]
  - apiGroups: [""] # "" indicates the core API group
    resources: ["pods/exec"]
    verbs: ["create", "get"]

An example of the associated CronJob, together with the ServiceAccount and RoleBinding it needs:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: zk-maint
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: zk-maint-pod-list-exec
  namespace: default
subjects:
- kind: ServiceAccount
  name: zk-maint
  namespace: default
roleRef:
  kind: Role
  name: pod-list-exec
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: zk-maint
  namespace: default
  labels:
    app: zk-maint
    tier: jobs
spec:
  schedule: "45 3 * * *"
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: zk-maint
            image: myorg/zkmaint:latest
          serviceAccountName: zk-maint
          restartPolicy: OnFailure
          imagePullSecrets:
          - name: azure-container-registry

This seems like an anti-pattern. Why can't you just run your worker pod as a job pod?

Regardless, you seem pretty convinced you need to do this, so here is what I would do.

Take your worker pod and wrap your shell execution in a simple webservice; it's 10 minutes of work in just about any language. Expose the port and put a Service in front of that worker (or workers). Then your job pods can simply curl <service>.<namespace>.svc.cluster.local:<port>/<path> (unless you've futzed with DNS).
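A rough sketch of that pattern, assuming the worker exposes its task on an HTTP endpoint (the service name, port, path, and curl image are all hypothetical):

apiVersion: v1
kind: Service
metadata:
  name: worker                     # hypothetical Service in front of the worker pods
spec:
  selector:
    app: worker                    # must match the worker pods' labels
  ports:
    - port: 8080                   # hypothetical port the webservice listens on
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: worker-task
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: trigger
              image: curlimages/curl:8.5.0   # any image with curl will do
              command: ["curl", "-fsS", "http://worker.default.svc.cluster.local:8080/run-task"]
          restartPolicy: OnFailure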


I managed to do this by creating a custom image that bundles doctl (DigitalOcean's command-line interface) and kubectl. The CronJob uses these two tools to download the cluster configuration and run a command against a container in an existing pod.

Here is a sample CronJob:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: drupal-cron
spec:
  schedule: "*/5 * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: drupal-cron
              image: juampynr/digital-ocean-cronjob:latest
              env:
                - name: DIGITALOCEAN_ACCESS_TOKEN
                  valueFrom:
                    secretKeyRef:
                      name: api
                      key: key
              command: ["/bin/bash","-c"]
              args:
                - doctl kubernetes cluster kubeconfig save drupster;
                  POD_NAME=$(kubectl get pods -l tier=frontend -o=jsonpath='{.items[0].metadata.name}');
                  kubectl exec $POD_NAME -c drupal -- vendor/bin/drush core:cron;
          restartPolicy: OnFailure

Here is the Docker image that the CronJob uses: https://hub.docker.com/repository/docker/juampynr/digital-ocean-cronjob

If you are not using DigitalOcean, figure out how to download the cluster configuration so that kubectl can use it. For example, with Google Cloud you would have to install the gcloud CLI instead.
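For instance, assuming the image also contains the gcloud CLI and a service account key mounted from a Secret at /secrets/gcloud/key.json (the cluster name and zone are placeholders), the container's command in the CronJob above could be adapted along these lines:

              command: ["/bin/bash","-c"]
              args:
                - gcloud auth activate-service-account --key-file=/secrets/gcloud/key.json;
                  gcloud container clusters get-credentials CLUSTER_NAME --zone ZONE;
                  POD_NAME=$(kubectl get pods -l tier=frontend -o=jsonpath='{.items[0].metadata.name}');
                  kubectl exec $POD_NAME -c drupal -- vendor/bin/drush core:cron;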

Here is the project repository where I implemented this: https://github.com/juampynr/drupal8-do.