 

Kubernetes requests not balanced

We've just had an increase in traffic to our Kubernetes cluster and I've noticed that, of our 6 application pods, 2 of them are seemingly not used very much. kubectl top pods returns the following:

[Image: kubectl top pods output showing per-pod CPU usage]

You can see that of the 6 pods, 4 are using more than 50% of the CPU (these are 2 vCPU nodes), but two of them aren't really doing much at all.

Our cluster is set up on AWS, using the ALB ingress controller. The load balancer is configured to use the least outstanding requests algorithm rather than round robin in an attempt to balance things out a bit more, but we're still seeing this imbalance.
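
For reference, with the ALB ingress controller the balancing algorithm is selected via a target-group attribute on the ingress; the relevant annotation looks roughly like this (a sketch, not our exact manifest):

metadata:
  annotations:
    # Switch the ALB target group from round robin to least outstanding requests
    alb.ingress.kubernetes.io/target-group-attributes: load_balancing.algorithm.type=least_outstanding_requests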

Is there any way of determining why this is happening, or whether it actually is a problem? I'm hoping it's normal behaviour rather than an issue, but I'd rather investigate than assume.

App deployment config

This is the configuration of the main application pods. Nothing fancy, really:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "app.fullname" . }}
  labels:
    {{- include "app.labels" . | nindent 4 }}
    app.kubernetes.io/component: web
spec:
  replicas: {{ .Values.app.replicaCount }}
  selector:
    matchLabels:
      app: {{ include "app.fullname" . }}-web

  template:
    metadata:
      annotations:
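        # Checksum of the env-var ConfigMap so pods roll when configuration changes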
        checksum/config: {{ include (print $.Template.BasePath "/config_maps/app-env-vars.yaml") . | sha256sum }}
      {{- with .Values.podAnnotations }}
        {{- toYaml . | nindent 10 }}
      {{- end }}
      labels:
        {{- include "app.selectorLabels" . | nindent 8 }}
        app.kubernetes.io/component: web
        app: {{ include "app.fullname" . }}-web
    spec:
      serviceAccountName: {{ .Values.serviceAccount.web }}
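      # Soft anti-affinity: prefer spreading web pods across availability zones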
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - {{ include "app.fullname" . }}-web
              topologyKey: failure-domain.beta.kubernetes.io/zone
      containers:
        - name: {{ .Values.image.name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          command:
            - bundle
          args: ["exec", "puma", "-p", "{{ .Values.app.containerPort }}"]
          ports:
            - name: http
              containerPort: {{ .Values.app.containerPort }}
              protocol: TCP
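          # Only mark the pod ready once /healthcheck responds successfully over HTTP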
          readinessProbe:
            httpGet:
              path: /healthcheck
              port: {{ .Values.app.containerPort }}
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 5
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          envFrom:
          - configMapRef:
              name: {{ include "app.fullname" . }}-cm-env-vars

          - secretRef:
              name: {{ include "app.fullname" . }}-secret-rails-master-key

          - secretRef:
              name: {{ include "app.fullname" . }}-secret-db-credentials

          - secretRef:
              name: {{ include "app.fullname" . }}-secret-db-url-app

          - secretRef:
              name: {{ include "app.fullname" . }}-secret-redis-credentials
asked Oct 20 '25 by PaReeOhNos

1 Answer

That's a known issue with Kubernetes.
In short, Kubernetes doesn't load balance long-lived TCP connections.
This excellent article covers it in detail.

The load distribution your service is showing matches this case exactly.
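
A sketch of one commonly suggested mitigation (not spelled out in the answer itself, so treat the names and annotations below as assumptions): have the ALB register pod IPs directly instead of routing through each node's kube-proxy, so balancing happens per request at the ALB rather than per TCP connection at the node. With the AWS Load Balancer Controller that is the target-type: ip mode, with app-web and app-web-svc as placeholder names:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-web   # placeholder
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    # Register pod IPs as targets so the ALB balances requests across pods directly
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/target-group-attributes: load_balancing.algorithm.type=least_outstanding_requests
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-web-svc   # placeholder Service in front of the web pods
                port:
                  number: 80

With instance targets the ALB balances across nodes and kube-proxy then picks a pod per connection, which is exactly where the long-lived-connection skew comes from; with ip targets the pods themselves are the ALB's targets.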

answered Oct 23 '25 by Olesya Bolobova

