K8s: why is there no easy way to get notifications if a pod becomes unhealthy and is restarted?

Why is there no easy way to get notifications if a pod becomes unhealthy and is restarted?

To me, it suggests I shouldn't care that a pod was restarted, but why not?

nhooyr asked Mar 05 '19 22:03


3 Answers

If a pod or container crashes for some reason, Kubernetes is supposed to provide reliability and availability by restarting it or starting a replacement somewhere else in the cluster. Having said that, you probably still want warnings and alerts, for example if the pod goes into CrashLoopBackOff.

You could write your own tool, but you can also watch for specific events in your cluster and alert on them using one of these tools:

  • kubewatch
  • kube-slack (Slack tool).
  • The most popular K8s monitoring tool: Prometheus.
  • A paid tool like Sysdig.
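
If you do write your own tool, the core check can be quite small. The following is a minimal sketch, not a complete notifier: it assumes you already have pod objects in the JSON shape returned by `kubectl get pods -o json` (each entry of `items`), and it only decides what is worth alerting on. The function name `container_alerts` and the `last_counts` dictionary are made up for this example; `restartCount` and the `CrashLoopBackOff` waiting reason are the fields Kubernetes reports in `status.containerStatuses`.

```python
from typing import Dict, List


def container_alerts(pod: dict, last_counts: Dict[str, int]) -> List[str]:
    """Return alert messages for one pod object.

    `last_counts` remembers the restart count seen on the previous call
    (hypothetical in-memory state; a real tool would need to persist it).
    """
    alerts: List[str] = []
    meta = pod.get("metadata", {})
    pod_id = f'{meta.get("namespace", "default")}/{meta.get("name", "?")}'

    for cs in pod.get("status", {}).get("containerStatuses", []):
        key = f'{pod_id}/{cs["name"]}'
        count = cs.get("restartCount", 0)

        # On first sighting we have no baseline, so we do not alert.
        previous = last_counts.get(key, count)
        if count > previous:
            alerts.append(f"{key} restarted ({previous} -> {count})")
        last_counts[key] = count

        # A container stuck in a crash loop reports this waiting reason.
        waiting = (cs.get("state") or {}).get("waiting") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            alerts.append(f"{key} is in CrashLoopBackOff")

    return alerts
```

You would call this in a loop over the pods you poll or watch, and forward the returned messages to Slack, email, or wherever you want them. Delivery is deliberately left out here.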
Rico answered Oct 23 '22 13:10


Think of Pods as ephemeral entities - they can live on different nodes, they can crash, they can start again...

Kubernetes is responsible for handling the lifecycle of a pod. Your job is to tell it where to run (affinity rules) and how to tell whether a pod is healthy.

There are many ways of monitoring pod crashes. For example, Prometheus has a great integration with Kubernetes.

Amityo answered Oct 23 '22 13:10


I wrote an open source tool to do this called Robusta. (Yes, it's named after the coffee.)

You can send the notifications to multiple destinations - here is a screenshot for Slack.

[Screenshot: crashing pod]

Under the hood, we use our own fork of Kubewatch to track API server events, and we add multiple features on top, like fetching logs.

You define in YAML the triggers and the actions:

- triggers:
  - on_pod_update: {}
  actions:
  - restart_loop_reporter:
      restart_reason: CrashLoopBackOff
  - image_pull_backoff_reporter:
      rate_limit: 3600

Each action is defined with a Python function, but you typically don't need to write them yourself because we have 50+ built-in actions. (See some examples here.)
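
To give a feel for what a setting like `rate_limit: 3600` is for, here is a rough sketch of rate-limited notifications in plain Python. This is not Robusta's code or API: the class `RateLimitedNotifier` is invented for illustration, and it assumes the value means "at most one notification per pod every 3600 seconds", which the config above does not spell out.

```python
import time
from typing import Callable, Dict


class RateLimitedNotifier:
    """Suppress repeat notifications for the same key within a time window.

    Hypothetical helper for illustration only; not part of any real tool.
    """

    def __init__(
        self,
        send: Callable[[str], None],
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send
        self._min_interval_s = min_interval_s
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_sent: Dict[str, float] = {}

    def notify(self, key: str, message: str) -> bool:
        """Send `message` unless `key` was notified too recently.

        Returns True if the message was sent, False if it was suppressed.
        """
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self._min_interval_s:
            return False
        self._last_sent[key] = now
        self._send(message)
        return True
```

For example, `RateLimitedNotifier(print, 3600).notify("prod/web-1", "ImagePullBackOff")` prints the first message for that pod and stays quiet about the same pod for the next hour, so a pod stuck in a back-off loop does not flood the channel.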

Natan Yellin answered Oct 23 '22 11:10