 

Kubernetes CPU throttling with CPU usage well below requests/limits

I have set up CPU and memory requests = limits on all containers of my pod in order to qualify it for the Guaranteed Quality of Service class. Now look at these CPU usage and CPU throttling graphs for the same pod over the last 6 hours.

[Image: CPU utilization and throttling for the pod over a 6-hour period]

Does this look normal and expected?

CPU usage has not touched even 50% of the set limit a single time, and still it was being throttled up to 58% at times.

And a side question: what does that red line at 25% in the throttling graph indicate?

I did some research on this topic and found that there was a bug in the Linux kernel that could have caused this, and that it was fixed in version 4.18 of the kernel. Reference: this and this

We are on GKE running Container-Optimized OS by Google. I checked the Linux kernel version on our nodes and they are on 4.19.112+, so I guess we already have that patch? What else could be the reason for this throttling pattern?
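For reference, the node kernel version, the pod's QoS class, and the raw CFS throttling counters can all be inspected directly. This is only a sketch, assuming the cgroup v1 layout that these COS nodes use, and <endpoints-pod-name> is a placeholder:

# Kernel version of each node (KERNEL-VERSION column)
kubectl get nodes -o wide

# Confirm the pod really landed in the Guaranteed QoS class
kubectl get pod <endpoints-pod-name> -o jsonpath='{.status.qosClass}'

# Raw CFS throttling counters for the main container (cgroup v1 paths)
kubectl exec <endpoints-pod-name> -c endpoints -- cat /sys/fs/cgroup/cpu/cpu.stat
# nr_periods     <scheduling periods elapsed>
# nr_throttled   <periods in which the quota was exhausted>
# throttled_time <total time spent throttled, in nanoseconds>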

P.S. This pod (actually a deployment with autoscaling) is deployed on a separate node pool which has none of our other workloads running on it. So the only pods other than this deployment running on nodes in this node pool are some metrics and logging agents and exporters. Here is the full list of pods running on the same node on which the pod discussed above is scheduled. There are indeed some pods that don't have any CPU limits set on them. Do I need to somehow set CPU limits on these as well (for example via a LimitRange like the sketch below)?

[Image: list of pods running on the same node]
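If those agent pods did need limits, the usual mechanism would be a namespace-level LimitRange that gives containers without explicit values a default. The sketch below is illustrative only: the object name, the namespace placeholder, and the values are made up, and whether GKE's own agents should be constrained at all is a separate question.

# Hypothetical LimitRange: containers created in this namespace without an
# explicit CPU request/limit pick up these defaults. Values are illustrative.
kubectl apply -n <namespace-of-the-unlimited-pods> -f - <<'EOF'
apiVersion: v1
kind: LimitRange
metadata:
  name: default-cpu-limits
spec:
  limits:
    - type: Container
      defaultRequest:
        cpu: 100m
      default:
        cpu: 200m
EOF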

Our GKE version is 1.16.9-gke.2

Here is the manifest file containing deployment, service, and auto scaler definitions.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: endpoints
  labels:
    app: endpoints
spec:
  replicas: 2
  selector:
    matchLabels:
      run: endpoints
  strategy:
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 0
  template:
    metadata:
      labels:
        run: endpoints
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: endpoints
          image: gcr.io/<PROJECT_ID>/endpoints:<RELEASE_VERSION_PLACEHOLDER>
          livenessProbe:
            httpGet:
              path: /probes/live
              port: 8080
            initialDelaySeconds: 20
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /probes/ready
              port: 8080
            initialDelaySeconds: 20
            timeoutSeconds: 5
          ports:
            - containerPort: 8080
              protocol: TCP
          env:
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: "/path/to/secret/gke-endpoints-deployments-access.json"
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: DEPLOYMENT_NAME
              value: "endpoints"
          resources:
            requests:
              memory: "5Gi"
              cpu: 2
            limits:
              memory: "5Gi"
              cpu: 2
          volumeMounts:
            - name: endpoints-gcp-access
              mountPath: /path/to/secret
              readOnly: true
          lifecycle:
            preStop:
              exec:
                # SIGTERM triggers a quick exit; gracefully terminate instead
                command: ["/bin/sh","-c","sleep 3; /usr/sbin/nginx -s quit; sleep 57"]
        # [START proxy_container]
        - name: cloudsql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy:1.16
          command: ["/cloud_sql_proxy",
                    "-instances=<PROJECT_ID>:<ZONE>:prod-db=tcp:3306,<PROJECT_ID>:<ZONE>:prod-db-read-replica=tcp:3307",
                    "-credential_file=/path/to/secret/gke-endpoints-deployments-access.json"]
          # [START cloudsql_security_context]
          securityContext:
            runAsUser: 2  # non-root user
            allowPrivilegeEscalation: false
          # [END cloudsql_security_context]
          resources:
            requests:
              memory: "50Mi"
              cpu: 0.1
            limits:
              memory: "50Mi"
              cpu: 0.1
          volumeMounts:
            - name: endpoints-gcp-access
              mountPath: /path/to/secret
              readOnly: true
        # [END proxy_container]
        # [START nginx-prometheus-exporter container]
        - name: nginx-prometheus-exporter
          image: nginx/nginx-prometheus-exporter:0.7.0
          ports:
            - containerPort: 9113
              protocol: TCP
          env:
            - name: CONST_LABELS
              value: "app=endpoints"
          resources:
            requests:
              memory: "50Mi"
              cpu: 0.1
            limits:
              memory: "50Mi"
              cpu: 0.1
        # [END nginx-prometheus-exporter container]
      tolerations:
        - key: "qosclass"
          operator: "Equal"
          value: "guaranteed"
          effect: "NoSchedule"
      nodeSelector:
        qosclass: guaranteed
      # [START volumes]
      volumes:
        - name: endpoints-gcp-access
          secret:
            secretName: endpoints-gcp-access
      # [END volumes]
---
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: endpoints-backendconfig
spec:
  timeoutSec: 60
  connectionDraining:
    drainingTimeoutSec: 60
---
apiVersion: v1
kind: Service
metadata:
  name: endpoints
  labels:
    app: endpoints
  annotations:
    cloud.google.com/neg: '{"ingress": true}' # Creates a NEG after an Ingress is created
    beta.cloud.google.com/backend-config: '{"ports": {"80":"endpoints-backendconfig"}}'
spec:
  type: NodePort
  selector:
    run: endpoints
  ports:
    - name: endpoints-nginx
      port: 80
      protocol: TCP
      targetPort: 8080
    - name: endpoints-metrics
      port: 81
      protocol: TCP
      targetPort: 9113
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: endpoints-autoscaler
spec:
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 40
  - type: External
    external:
      metricName: external.googleapis.com|prometheus|nginx_http_requests_total
      metricSelector:
        matchLabels:
          metric.labels.app: endpoints
      targetAverageValue: "5"
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: endpoints
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: endpoints-nginx-monitor
  namespace: monitoring
  labels:
    app: endpoints-nginx-monitor
    chart: prometheus-operator-8.13.7
    release: prom-operator
    heritage: Tiller
spec:
  selector:
    matchLabels:
      app: endpoints
  namespaceSelector:
    any: true
  endpoints:
  - port: endpoints-metrics
    path: "/metrics"

And here is the Dockerfile for the only custom container image used in the deployment:

# Dockerfile extending the generic PHP image with application files for a
# single application.
FROM gcr.io/google-appengine/php:latest

# The Docker image will configure the document root according to this
# environment variable.
ENV DOCUMENT_ROOT /app

RUN /bin/bash /stackdriver-files/enable_stackdriver_integration.sh
Asked Jun 28 '20 by Muhammad Anas

People also ask

What causes CPU throttling in Kubernetes?

CPU throttling occurs when you configure a CPU limit on a container, which can inadvertently slow your application's response time. Even if you have more than enough resources on your underlying node, your container workload will still be throttled because it was not configured properly.

Should you use CPU limits in Kubernetes?

CPU limits on Kubernetes are an antipattern. Many people think you need CPU limits on Kubernetes, but this isn't true; in most cases, Kubernetes CPU limits do more harm than good.

What is CPU limit in Kubernetes?

A CPU limit is the maximum amount of CPU a container is allowed to use. For example, a container might have a limit of 0.5 CPU and 128MiB of memory.

What happens if pod exceeds CPU limit?

If a container attempts to exceed the specified limit, the system will throttle the container.

How does Kubernetes limit CPU and memory?

Kubernetes uses kernel throttling to implement CPU limits. If an application goes above the limit, it gets throttled (aka fewer CPU cycles). Memory requests and limits, on the other hand, are implemented differently, and it's easier to detect when they are exceeded.

What is CPU throttling in Kubernetes (K8s)?

If you configure a CPU limit in K8s, it is enforced by setting a period and a quota. If a process running in a container uses up the quota within a period, it is preempted and has to wait for the next period. It is throttled. So this is the effect you are experiencing.

How much CPU time can a container use without throttling?

Let's say you have configured 2 cores as the CPU limit; k8s will translate this to 200ms. That means the container can use a maximum of 200ms of CPU time per 100ms period without getting throttled. And this is where all the misunderstanding starts.
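To make that concrete, the period and quota the kernel actually enforces can be read from the container's cgroup; a minimal sketch, assuming the cgroup v1 layout and the 2-CPU limit from the manifest above (the pod name is a placeholder):

# Period/quota pair the CFS scheduler enforces for the main container (cgroup v1)
kubectl exec <endpoints-pod-name> -c endpoints -- \
  cat /sys/fs/cgroup/cpu/cpu.cfs_period_us /sys/fs/cgroup/cpu/cpu.cfs_quota_us
# 100000   <- period: 100ms
# 200000   <- quota: 2 (CPU limit) x 100ms = 200ms of CPU time per period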

What is the upper bound of CPU throttling?

So far as I can comprehend, the theoretical upper bound of throttling is n * (100ms) - limit, where n is the number of vCPUs and limit is how many milliseconds of CPU you are allotted in a 100ms window (calculated earlier as cpuLimit * 100ms).
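A hypothetical worked instance of that bound, assuming a 4-vCPU node and the 200ms quota implied by the 2-CPU limit above:

# Worst case, threads of the container keep all 4 vCPUs busy for a full 100ms window:
# demanded CPU time = 4 * 100ms = 400ms, allowed = 200ms, so up to 200ms is throttled.
n_vcpus=4; period_ms=100; quota_ms=200
echo "$(( n_vcpus * period_ms - quota_ms ))ms throttled per window (upper bound)"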


1 Answer

I don't know what that red line is, so I'll skip that one. It would be nice, though, to know what you expected to happen in the CPU throttling case.

So, about your CPU usage and throttling: there is no indication that anything is going wrong. CPU throttling happens on any modern system when there is plenty of CPU available; the clock slows down and the machine starts running slower (e.g. a 2.3GHz machine switches to 2.0GHz). This is the reason you can't set a CPU limit based on a percentage.

So, from your graphs, what I speculate is happening is the CPU clock going down and, naturally, the percentage going up, as expected. Nothing weird.

Answered Oct 20 '22 by suren