
How can I ensure graceful scaling in Kubernetes?

As part of scaling pods in Kubernetes, I want to make sure my HTTP connections are served gracefully before a pod shuts down. To that end I have implemented this code in Go:

package main

import (
    "fmt"
    "io"
    "net/http"
    "os"
    "os/signal"
    "syscall"

    "github.com/braintree/manners"
)

func main() {

    shutdown := make(chan int)

    //create a notification channel to shutdown
    sigChan := make(chan os.Signal, 1)

    //start the http server
    http.HandleFunc("/", hello)
    server := manners.NewWithServer(&http.Server{Addr: ":80", Handler: nil})
    go func() {
        //blocks until the server has been closed and requests have drained
        server.ListenAndServe()
        shutdown <- 1
    }()

    //register for interrupt (Ctrl+C) and SIGTERM (docker)
    signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)
    go func() {
        <-sigChan
        fmt.Println("Shutting down...")
        server.Close()
    }()

    //wait for the server goroutine to finish before exiting
    <-shutdown
}

func hello(w http.ResponseWriter, r *http.Request) {
    // time.Sleep(3000 * time.Millisecond)
    io.WriteString(w, "Hello world!")
}

This watches for the Docker SIGTERM and shuts down gracefully once existing requests have been served. When I run this container in Kubernetes with 10 instances, I can scale up and down without incident, as long as I don't scale down to a single instance. When I scale to a single instance, I see a short burst of HTTP errors, and then everything looks fine again.

I find this strange: I would assume that when scaling down, the proxy is updated first and only then are the containers shut down, in which case the code above would let in-flight requests finish.

In my current setup I am running 2 nodes. Maybe the issue appears when scaling drops below the number of nodes and there is some sort of timing issue with etcd updates? Any insight into what is going on here would be really useful.

bite-code asked Jul 22 '15 21:07


1 Answer

You should use a readiness check (http://kubernetes.io/v1.0/docs/user-guide/production-pods.html#liveness-and-readiness-probes-aka-health-checks) that transitions the Pod to "not ready" after you receive a SIGTERM.

Once that happens, the Service will remove the Pod from serving before the Pod is deleted.

(Without a readiness check, the Service simply doesn't know that the Pod is going away until it is actually deleted.)
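
As a rough illustration of the readiness idea, here is a minimal sketch in Go. It is not the code from the question: it swaps manners for the standard library's http.Server.Shutdown so that it has no external dependencies. The /readyz path, the readiness type, and the 10-second pause are assumptions made for the example; the pause would need to be longer than the time your readiness probe takes to register a failure.

```go
package main

import (
	"context"
	"log"
	"net/http"
	"os"
	"os/signal"
	"sync/atomic"
	"syscall"
	"time"
)

// readiness tracks whether this instance wants traffic.
// The zero value means "not ready". (Hypothetical helper type.)
type readiness struct{ flag int32 }

// set marks the instance ready (true) or not ready (false).
func (r *readiness) set(ok bool) {
	var v int32
	if ok {
		v = 1
	}
	atomic.StoreInt32(&r.flag, v)
}

// status returns the HTTP status code the probe endpoint answers with.
func (r *readiness) status() int {
	if atomic.LoadInt32(&r.flag) == 1 {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

// ServeHTTP lets readiness be mounted directly as the probe handler.
func (r *readiness) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(r.status())
}

func main() {
	ready := &readiness{}
	mux := http.NewServeMux()
	mux.Handle("/readyz", ready) // assumed probe path
	mux.HandleFunc("/", func(w http.ResponseWriter, _ *http.Request) {
		w.Write([]byte("Hello world!"))
	})
	server := &http.Server{Addr: ":80", Handler: mux}

	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)

	done := make(chan struct{})
	go func() {
		<-sigChan
		ready.set(false)             // probes start failing from here on
		time.Sleep(10 * time.Second) // assumed: long enough for the probe to notice
		ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
		defer cancel()
		if err := server.Shutdown(ctx); err != nil { // waits for in-flight requests
			log.Printf("shutdown: %v", err)
		}
		close(done)
	}()

	ready.set(true)
	if err := server.ListenAndServe(); err != http.ErrServerClosed {
		log.Fatal(err)
	}
	<-done // ListenAndServe returns as soon as Shutdown starts; wait for the drain
}
```

The pod's readiness probe would then be pointed at the assumed /readyz path, and the pod's termination grace period has to be long enough to cover both the pause and the drain.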

You may also want to use a PreStop hook that sets readiness to false and then drains all existing requests. PreStop hooks are called synchronously before a Pod is deleted; they are described in the Kubernetes documentation on container lifecycle hooks.
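
For the drain step itself, one possible shape is an HTTP endpoint inside the application that a PreStop hook calls. The sketch below assumes this; the tracker type, the /prestop and /readyz paths, and the 20-second timeout are all hypothetical names and values chosen for illustration. Because the hook is synchronous, the container is not signalled until the handler returns.

```go
package main

import (
	"log"
	"net/http"
	"sync/atomic"
	"time"
)

// tracker counts in-flight requests and remembers whether draining has begun.
// (Hypothetical helper type for this sketch.)
type tracker struct {
	inflight int64
	draining int32
}

// wrap counts every request that passes through next.
func (t *tracker) wrap(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt64(&t.inflight, 1)
		defer atomic.AddInt64(&t.inflight, -1)
		next.ServeHTTP(w, r)
	})
}

// isDraining reports whether drain has been called.
func (t *tracker) isDraining() bool {
	return atomic.LoadInt32(&t.draining) == 1
}

// drain marks the tracker as draining, then polls until no requests are in
// flight or the timeout expires. It reports whether the drain completed.
func (t *tracker) drain(timeout time.Duration) bool {
	atomic.StoreInt32(&t.draining, 1)
	deadline := time.Now().Add(timeout)
	for atomic.LoadInt64(&t.inflight) > 0 {
		if time.Now().After(deadline) {
			return false
		}
		time.Sleep(10 * time.Millisecond)
	}
	return true
}

func main() {
	t := &tracker{}
	mux := http.NewServeMux()

	// Application traffic is counted; the probe and hook endpoints are not.
	mux.Handle("/", t.wrap(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.Write([]byte("Hello world!"))
	})))

	// Readiness fails as soon as draining has started.
	mux.HandleFunc("/readyz", func(w http.ResponseWriter, _ *http.Request) {
		if t.isDraining() {
			w.WriteHeader(http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
	})

	// Assumed to be called by the PreStop hook; blocks until drained or timed out.
	mux.HandleFunc("/prestop", func(w http.ResponseWriter, _ *http.Request) {
		if t.drain(20 * time.Second) {
			w.WriteHeader(http.StatusOK)
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable)
	})

	log.Fatal(http.ListenAndServe(":80", mux))
}
```

Note that drain returns immediately when the server happens to be idle, so a real hook would probably also wait long enough for the failing readiness probe to take the Pod out of the Service before returning.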


brendan answered Nov 15 '22 03:11
