Our aim is to horizontally scale a .NET Core 2.0 Web API using Kubernetes. The Web API application will be served by Kestrel.
It looks like we can gracefully handle the termination of pods by configuring Kestrel's shutdown timeout, so now we are looking into how to probe the application to determine readiness and liveness.
Would it be enough to simply probe the Web API with an HTTP request? If so, would it be a good idea to create a new health check controller to handle these probing requests, or would it make more sense to probe an actual endpoint that would be consumed in normal use?
What should we consider when differentiating between the liveness and readiness probes?
There are three types of probes: HTTP, Command, and TCP. You can use any of them for liveness and readiness checks.
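As a rough illustration of the three probe types, the fragments below show one way each could be declared for a liveness probe. Each fragment would sit under a container entry in a Pod spec. The path, port, and marker file are assumptions chosen for the example, not values from the question.

```yaml
# Option 1: HTTP probe, succeeds on a response status from 200 to 399.
livenessProbe:
  httpGet:
    path: /health        # hypothetical endpoint
    port: 8080           # hypothetical container port
---
# Option 2: command probe, succeeds when the command exits with code 0.
livenessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]   # hypothetical marker file
---
# Option 3: TCP probe, succeeds when the port accepts a connection.
livenessProbe:
  tcpSocket:
    port: 8080           # hypothetical container port
```

For a Web API served over HTTP, the httpGet form is usually the most natural fit, since it exercises the same request pipeline the clients use.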
Kubernetes gives you three kinds of health checks performed by the kubelet: the liveness probe, the readiness probe, and the startup probe.
There is no separate endpoint for reading readiness probe results, but we can inspect a pod's events, including probe failures, with the kubectl describe pods <POD_NAME> command. Use the kubectl get pods command to list the pods together with their status and ready state; a pod that passes its probes is shown as running and ready.
What if I don't specify a liveness probe? If you don't specify a liveness probe, then OpenShift will decide whether to restart your container based on the status of the container's PID 1 process. The PID 1 process is the parent process of all other processes that run inside the container.
I would recommend performing health checks through separate endpoints. In general, there are a number of good reasons for doing so, like:
As you pointed out, a good way to avoid any of the above could be setting up a separate Controller to handle health checks.
As an alternative option, there is a Health Checks library for ASP.NET Core. At the time of writing this answer, it is not officially part of ASP.NET Core and no NuGet packages are available yet, but there is a plan for this to happen in future releases. For now, you can easily pull the code from the Official Repository and include it in your solution as explained in the Microsoft documentation. It is currently planned for inclusion in ASP.NET Core 2.2, as described in the ASP.NET Core 2.2 Roadmap.
I personally find it very elegant, as you will configure everything through Startup.cs and Program.cs and won't need to explicitly create a new endpoint, as the library already handles that for you.
I have been using it in a few projects and I would definitely recommend it. The repository includes an example specific for ASP.NET Core projects you can use to get quickly up to speed.
In Kubernetes, you may then set up liveness and readiness probes through HTTP. As explained in the Kubernetes documentation, while the setup for both is almost identical, Kubernetes takes different actions depending on the probe:
Liveness probe from Kubernetes documentation:
Many applications running for long periods of time eventually transition to broken states, and cannot recover except by being restarted. Kubernetes provides liveness probes to detect and remedy such situations.
Readiness probe from Kubernetes documentation:
Sometimes, applications are temporarily unable to serve traffic. For example, an application might need to load large data or configuration files during startup. In such cases, you don’t want to kill the application, but you don’t want to send it requests either. Kubernetes provides readiness probes to detect and mitigate these situations. A pod with containers reporting that they are not ready does not receive traffic through Kubernetes Services.
So, while repeated unhealthy responses to a liveness probe will cause the kubelet to kill the container (and so, the application) and restart it according to the Pod's restart policy, an unhealthy response to a readiness probe will simply cause the Pod to receive no traffic until it gets back to a healthy status.
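To make the difference concrete, here is a minimal sketch of a container spec fragment that gives each probe its own endpoint. The container name, image, and the /health/live and /health/ready paths are hypothetical; the application would have to expose them itself.

```yaml
containers:
- name: webapi                  # hypothetical container name
  image: example/webapi:1.0     # hypothetical image
  ports:
  - name: http
    containerPort: 8080         # hypothetical port
  livenessProbe:                # repeated failures lead to a container restart
    httpGet:
      path: /health/live        # hypothetical endpoint
      port: http
  readinessProbe:               # failures remove the Pod from Service endpoints
    httpGet:
      path: /health/ready       # hypothetical endpoint
      port: http
```

Splitting the endpoints lets the readiness check report a temporary condition without that same condition triggering a restart through the liveness check.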
What to consider when differentiating liveness and readiness probes?
For liveness probe: I would recommend defining what makes your application healthy, i.e. the minimum requirements for user consumption, and implementing health checks based on that. This typically involves external resources or applications running as separate processes, e.g. databases, web services, etc. You may define health checks by using the ASP.NET Core Health Checks library or manually with a separate Controller.
For readiness probe: You simply want to call your service to verify that it actually responds in time, which allows Kubernetes to balance traffic accordingly. Trivially (and in most cases, as suggested by Lukas in another answer), you may use the exact same endpoint you would use for liveness but with different timeouts, although this really depends on your needs and requirements.
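The "same endpoint, different timeouts" option could look roughly like the fragment below, placed under a container entry. The /health path, the named port http, and all numeric values are assumptions for illustration and would need tuning for a real service.

```yaml
livenessProbe:
  httpGet:
    path: /health          # hypothetical shared endpoint
    port: http             # assumes a container port named "http"
  timeoutSeconds: 5        # tolerate a slow response before counting a failure
  failureThreshold: 3      # restart only after three consecutive failures
readinessProbe:
  httpGet:
    path: /health          # same endpoint as the liveness probe
    port: http
  timeoutSeconds: 1        # stop routing traffic once responses get slow
  failureThreshold: 1      # a single failure marks the Pod as not ready
```

With these values a slow instance stops receiving traffic quickly, but is only restarted if it stays unresponsive for several probe periods.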
What should we consider when differentiating between the liveness and readiness probes?
My recommendation would be to provide a /health endpoint in your application, separate from your application endpoints. This is useful if you want to block your consumers from calling your internal health endpoint. Then you can configure Kubernetes to query your HTTP /health endpoint like in the example below.
apiVersion: v1
kind: Pod
metadata:
  name: goproxy
spec:
  containers:
  - name: goproxy
    image: k8s.gcr.io/goproxy:0.1
    ports:
    - name: http
      containerPort: 8080
    readinessProbe:
      httpGet:
        port: http
        path: /health
      initialDelaySeconds: 60
    livenessProbe:
      httpGet:
        port: http
        path: /health
Inside your /health endpoint you should check the internal state of your application and return a status code of either 200 if everything is OK or 503 if your application is having issues. Keep in mind that health checks run periodically for every instance (every 10 seconds by default, configurable through periodSeconds), so if you perform expensive operations to determine your application state you might slow down your application.
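The probe frequency is configurable per probe, which gives some control over how much load the checks add. A sketch of a readiness probe fragment with explicit timing follows; the numbers are illustrative assumptions, not recommendations.

```yaml
readinessProbe:
  httpGet:
    path: /health
    port: http             # assumes a container port named "http"
  periodSeconds: 15        # probe every 15 seconds (the default is 10)
  timeoutSeconds: 2        # a probe taking longer than this counts as a failure
  failureThreshold: 3      # mark not ready after three consecutive failures
```

If the health endpoint has to perform costly checks, a longer periodSeconds or caching the last result inside the application are two ways to limit the overhead.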
What should we consider when differentiating between the liveness and readiness probes?
Usually the only difference between the liveness and readiness probes is the timing configured for each probe. If your application needs 60 seconds to start, for example, you would set the initial delay (initialDelaySeconds) of your readiness probe to 60 while keeping the default liveness settings.
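One caveat with slow-starting applications is that a liveness probe with no initial delay begins probing immediately, so it can restart the container before startup has finished. On clusters recent enough to support it, a startupProbe is one way to cover that window, since liveness and readiness probing only begins once the startup probe has succeeded. The fragment below is a sketch with assumed values sized for a start time of roughly 60 seconds.

```yaml
startupProbe:
  httpGet:
    path: /health
    port: http             # assumes a container port named "http"
  periodSeconds: 5
  failureThreshold: 12     # 12 attempts x 5 seconds = up to 60 seconds to start
livenessProbe:
  httpGet:
    path: /health
    port: http
readinessProbe:
  httpGet:
    path: /health
    port: http
```

On clusters without startupProbe support, setting initialDelaySeconds on the liveness probe as well as the readiness probe serves a similar purpose.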