
Alert when a Docker container/pod is in Error or CrashLoopBackOff in Kubernetes

I have a Kubernetes cluster set up on AWS where I am trying to monitor several pods using cAdvisor + Prometheus + Alertmanager. What I want to do is send an email alert (with the service/container name) if a container/pod goes down, gets stuck in the Error or CrashLoopBackOff state, or is stuck in any other state apart from Running.

shiv455 asked Mar 25 '18 03:03



1 Answer

Prometheus collects a wide range of metrics. For example, you can use the kube_pod_container_status_restarts_total metric (exposed by kube-state-metrics) to monitor container restarts, which will reflect your problem.

It carries labels that you can use in the alert:

  • container=container-name
  • namespace=pod-namespace
  • pod=pod-name

So all you need is to configure your alertmanager.yaml with the correct SMTP settings and a receiver, and to add alerting rules to Prometheus, like this:

global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_from: '[email protected]'
  smtp_auth_username: 'alertmanager'
  smtp_auth_password: 'password'

receivers:
- name: 'team-X-mails'
  email_configs:
  - to: '[email protected]'

# Only one default receiver
route:
  receiver: team-X-mails

# Example group with one alert. Rule groups belong in a Prometheus rules file
# (loaded through rule_files in prometheus.yml), not in alertmanager.yaml.
groups:
- name: example-alert
  rules:
  # Alert about restarts. increase() over 1h is used so that the alert
  # clears once the container stops restarting.
  - alert: RestartAlerts
    expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
    for: 10m
    annotations:
      summary: "More than 5 restarts in pod {{ $labels.pod }}"
      description: "{{ $labels.container }} restarted {{ printf \"%.0f\" $value }} times in the last hour in pod {{ $labels.namespace }}/{{ $labels.pod }}"
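
The restart counter only catches crash loops indirectly. If you would rather alert on the state itself (CrashLoopBackOff, Error, or a pod that is not Running), kube-state-metrics also exposes per-state metrics. The following is a minimal sketch, not a tested production config: the group name, alert names, `for` durations and the `severity` label are illustrative assumptions, and which metrics and `reason` values are available depends on your kube-state-metrics version.

```yaml
# Prometheus rules file (hypothetical names and thresholds)
groups:
- name: pod-state-alerts
  rules:
  # Container is currently waiting with reason CrashLoopBackOff
  - alert: ContainerCrashLooping
    expr: kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} == 1
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Container {{ $labels.container }} is in CrashLoopBackOff"
      description: "Container {{ $labels.container }} in pod {{ $labels.namespace }}/{{ $labels.pod }} has been in CrashLoopBackOff for at least 5 minutes"
  # Container is currently terminated with reason Error
  - alert: ContainerTerminatedWithError
    expr: kube_pod_container_status_terminated_reason{reason="Error"} == 1
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "Container {{ $labels.container }} terminated with Error"
      description: "Container {{ $labels.container }} in pod {{ $labels.namespace }}/{{ $labels.pod }} is in the Error state"
  # Pod is in a phase other than Running or Succeeded
  - alert: PodNotRunning
    expr: sum by (namespace, pod, phase) (kube_pod_status_phase{phase=~"Pending|Failed|Unknown"}) > 0
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Pod {{ $labels.namespace }}/{{ $labels.pod }} is in phase {{ $labels.phase }}"
```

The `for` clause keeps short-lived states, such as a pod that is briefly Pending during a rollout, from sending mail. With the single default route shown above, all of these alerts would go to the same email receiver.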
Anton Kostenko answered Sep 23 '22 08:09
