In Kubernetes, how can I scale a Deployment to zero when idle

Tags: kubernetes, horizontal-pod-autoscaling

I'm running a fairly resource-intensive service on a Kubernetes cluster to support CI activities. Only a single replica is needed, but it uses a lot of resources (16 CPUs), and it's generally only needed during work hours (weekdays, roughly 8am-6pm). My cluster runs in a cloud and is set up with instance autoscaling, so if this service is scaled to zero, its instance can be terminated.

The service is third-party code that cannot be modified (well, not easily). Other than being CPU-intensive, it's a fairly typical HTTP service.

What options exist to automatically scale this Deployment down to zero when idle?

I'd rather not set up a schedule to scale it up/down around working hours, because CI activities are occasionally performed outside normal hours. I'd like the scaling to be dynamic (for example, scale to zero when idle for >30 minutes, or scale to one when an incoming connection arrives).
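To make the rule I have in mind concrete, here is a minimal sketch of the decision logic in Python. The function name and parameters are hypothetical, and it assumes something else (a proxy, for instance) tracks active connections and the time of the last activity.

```python
from datetime import datetime, timedelta

# Assumed threshold, taken from the ">30 minutes" example above.
IDLE_TIMEOUT = timedelta(minutes=30)


def desired_replicas(now, last_activity, active_connections,
                     idle_timeout=IDLE_TIMEOUT):
    """Return the replica count (0 or 1) for a single-replica service."""
    if active_connections > 0:
        # An incoming or in-flight connection always needs the pod.
        return 1
    if now - last_activity > idle_timeout:
        # Nothing has happened for longer than the timeout: scale to zero.
        return 0
    # Recently active: keep the replica around.
    return 1


now = datetime(2020, 5, 4, 16, 5)
print(desired_replicas(now, now - timedelta(minutes=45), 0))
print(desired_replicas(now, now - timedelta(minutes=45), 1))
```

The hard part is not this rule but getting the inputs: something has to sit in front of the service to notice connections while it is scaled to zero.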

533

asked May 04 '20 16:05

Patrick

4 Answers

I ended up implementing a custom solution: https://github.com/greenkeytech/zero-pod-autoscaler

Compared to Knative, it's more of a "toy" project: fairly small, with no dependency on Istio. It's been working well for my use case, though I don't recommend that others use it unless they are willing to adopt the code as their own.

100

answered Oct 20 '22 09:10

Patrick

Kubernetes itself supports scaling to zero only through an API call (for example, setting the Deployment's replica count to 0), because the Horizontal Pod Autoscaler only scales down to 1 replica.
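As a rough illustration of the "API call" route, here is a small Python sketch that shells out to `kubectl scale`. The deployment name `ci-service` and the `default` namespace are placeholders, and it assumes `kubectl` is installed and already configured for the cluster.

```python
import subprocess


def scale_command(deployment, replicas, namespace="default"):
    """Build the kubectl argv that sets a Deployment's replica count."""
    return [
        "kubectl", "scale", f"deployment/{deployment}",
        f"--replicas={replicas}",
        "--namespace", namespace,
    ]


def scale(deployment, replicas, namespace="default"):
    """Run the command; raises CalledProcessError if kubectl fails."""
    subprocess.run(scale_command(deployment, replicas, namespace), check=True)


# "ci-service" is a hypothetical Deployment name.
print(" ".join(scale_command("ci-service", 0)))
```

Calling `scale("ci-service", 0)` stops the pod and `scale("ci-service", 1)` brings it back. What is still missing is something that decides when to make each call.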

However, there are a few operators that let you overcome that limitation, either by intercepting the requests coming to your pods or by inspecting some metrics.

You can take a look at Knative or KEDA. Both make your application serverless, but they do so in different ways.

Knative, by means of Istio, intercepts the requests: if there is an active pod serving them, it routes the incoming request to that pod; otherwise it triggers a scale-up.

By contrast, KEDA best fits an event-driven architecture, because it can inspect predefined metrics, such as consumer lag, queue length, or custom metrics (collected from Prometheus, for example), and trigger scaling based on them.
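To give a feel for metric-driven scaling, here is a simplified Python sketch of turning a queue length into a replica count. This is not KEDA's actual code; the function and its parameters are made up for illustration.

```python
import math


def replicas_for_queue(queue_length, target_per_replica, max_replicas=1):
    """Map a queue length to a replica count, allowing zero.

    target_per_replica is the (assumed) number of queued items one
    replica is expected to handle.
    """
    if queue_length <= 0:
        # Nothing to process: the workload can be scaled to zero.
        return 0
    needed = math.ceil(queue_length / target_per_replica)
    # Never exceed the configured ceiling.
    return min(max_replicas, needed)


print(replicas_for_queue(0, 5))
print(replicas_for_queue(12, 5, max_replicas=10))
```

Note that this approach needs a metric that is visible while the workload is at zero, which is why it suits queues better than a plain HTTP service like the one in the question.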

Both support scaling to zero when predefined conditions are met within an equally predefined time window.

Hope it helped.

answered Oct 20 '22 07:10

amenic

There are a few ways to achieve this; possibly the most "native" is Knative with Istio. Kubernetes by default allows you to scale to zero; however, you need something that can broker the scale-up based on an "input event", essentially something that supports an event-driven architecture.

You can take a look at the official documentation here: https://knative.dev/docs/serving/configuring-autoscaling/

answered Oct 20 '22 08:10

Tom.Bastianello

The horizontal pod autoscaler currently doesn’t allow setting the minReplicas field to 0, so the autoscaler will never scale down to zero, even if the pods aren’t doing anything. Allowing the number of pods to be scaled down to zero can dramatically increase the utilization of your hardware.

When you run services that get requests only once every few hours or even days, it doesn’t make sense to have them running all the time, eating up resources that could be used by other pods.

But you still want to have those services available immediately when a client request comes in.

This is known as idling and un-idling. It allows pods that provide a certain service to be scaled down to zero. When a new request comes in, the request is blocked until the pod is brought up and then the request is finally forwarded to the pod.
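The un-idling behaviour can be sketched in Python as a small gate that holds requests until the backend is up. This is only an in-process illustration: `scale_up` and `forward` are hypothetical callables standing in for "scale the Deployment to 1 and wait until it is ready" and "proxy the request to the pod".

```python
import threading


class Unidler:
    """Hold requests while the backend is idled, then forward them."""

    def __init__(self, scale_up, timeout=60.0):
        self._scale_up = scale_up      # blocks until the backend is ready
        self._timeout = timeout        # max seconds a request waits
        self._ready = threading.Event()
        self._lock = threading.Lock()
        self._starting = False

    def handle(self, request, forward):
        if not self._ready.is_set():
            with self._lock:
                # Only the first blocked request triggers the scale-up.
                start = not self._starting and not self._ready.is_set()
                if start:
                    self._starting = True
            if start:
                try:
                    self._scale_up()
                    self._ready.set()
                finally:
                    with self._lock:
                        self._starting = False
            elif not self._ready.wait(self._timeout):
                raise TimeoutError("backend did not become ready")
        return forward(request)

    def idle(self):
        """Mark the backend as scaled to zero again."""
        self._ready.clear()


starts = []
gate = Unidler(scale_up=lambda: starts.append("up"))
print(gate.handle("req-1", lambda r: f"ok:{r}"), len(starts))
```

A real implementation would be a network proxy in front of the Service, and would also need to decide when to call `idle()`, for example after a period with no traffic.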

Kubernetes doesn’t provide this feature yet, but it will eventually.

answered Oct 20 '22 08:10

omricoco
