
Use Prometheus operator with DB volume for k8s

We are trying to monitor Kubernetes with Grafana and the Prometheus Operator. Most of the metrics are working as expected and I was able to see the dashboards with the right values; our system contains 10 nodes with about 500 pods overall. But when I restarted Prometheus, all the data was deleted. I want it to be kept for two weeks.

My question is: how can I configure a volume for Prometheus so that it keeps the data for two weeks, or up to a 100 GB database?

I found the following (we use Prometheus operator):

https://github.com/coreos/prometheus-operator/blob/master/Documentation/user-guides/storage.md

This is the config of the Prometheus Operator:

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  labels:
    k8s-app: prometheus-operator
  name: prometheus-operator
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: prometheus-operator
  template:
    metadata:
      labels:
        k8s-app: prometheus-operator
    spec:
      containers:
      - args:
        - --kubelet-service=kube-system/kubelet
        - --logtostderr=true
        - --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
        - --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.29.0
        image: quay.io/coreos/prometheus-operator:v0.29.0
        name: prometheus-operator
        ports:
        - containerPort: 8080
          name: http

This is the config of the Prometheus resource:

    apiVersion: monitoring.coreos.com/v1
    kind: Prometheus
    metadata:
      name: prometheus
      namespace: monitoring
      labels: 
        prometheus: prometheus
    spec:
      replicas: 2
      serviceAccountName: prometheus
      serviceMonitorNamespaceSelector: {}
      serviceMonitorSelector:
        matchLabels:
          role: observeable
      tolerations:
      - key: "WorkGroup"
        operator: "Equal"
        value: "operator"
        effect: "NoSchedule"
      - key: "WorkGroup"
        operator: "Equal"
        value: "operator"
        effect: "NoExecute"
      resources:
        limits:
          cpu: 8000m
          memory: 24000Mi
        requests:
          cpu: 6000m
          memory: 6000Mi
      storage:
        volumeClaimTemplate:
          spec:
            selector:
              matchLabels:
                app: prometheus
            resources:
              requests:
                storage: 100Gi

We have an NFS file system, and the above storage config doesn't work. My questions are:

  1. What I'm missing here is how to configure the volume, server, and path under the nfs section. Where should I find this /path/to/prom/db? How do I refer to it? Should I create it somehow, or just provide the path?

We have NFS configured in our system.

  2. How do I connect it to Prometheus?

As I don't have deep knowledge of PVs and PVCs, I've created the following (I'm not sure about those values: what is my server, and what path should I provide?)...

server: myServer
path: "/path/to/prom/db"

What should I put there, and how do I make my Prometheus (i.e. the config provided above) use it?

apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
    prometheus: prometheus
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce # required
  nfs:
    server: myServer
    path: "/path/to/prom/db"

Is there any persistent volume type other than NFS that I can use for my use case? Please advise how.

asked Mar 11 '19 by JDC



3 Answers

I started working with the operator chart recently, and managed to add persistence without defining a PV and PVC.

With the new chart configuration, adding persistence is much easier than you describe: just edit the file /helm/vector-chart/prometheus-operator-chart/values.yaml under prometheus.prometheusSpec:

storageSpec:
  volumeClaimTemplate:
    spec:
      storageClassName: prometheus
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
    selector: {}

And add this file at /helm/vector-chart/prometheus-operator-chart/templates/prometheus/storageClass.yaml:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: prometheus
provisioner: kubernetes.io/aws-ebs
reclaimPolicy: Retain
parameters:
  type: gp2
  zones: "ap-southeast-2a, ap-southeast-2b, ap-southeast-2c"
  encrypted: "true"

This will automatically create both a PV and a PVC for you; the claim will provision an EBS volume in AWS, which will store all your data.
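
Since the question uses NFS rather than AWS, a hedged note: the same pattern works, but dynamic provisioning for NFS needs an external provisioner running in the cluster (for example the nfs-subdir-external-provisioner project), as there is no in-tree NFS provisioner. A sketch only; the provisioner name below is a placeholder and must match whatever name your NFS provisioner registers:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: prometheus-nfs
# Placeholder: replace with the provisioner name of the NFS provisioner you deploy.
provisioner: example.com/external-nfs
reclaimPolicy: Retain

Alternatively, skip dynamic provisioning and bind to a hand-created NFS PV, as attempted in the question.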

answered Sep 29 '22 by Shahar Hamuzim Rajuan


You have to use a persistent volume and a persistent volume claim (PV & PVC) to persist data. See https://kubernetes.io/docs/concepts/storage/persistent-volumes/ and look carefully at provisioning, reclaim policy, access modes, and storage types on that page.
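
For illustration, a bare-bones claim that would bind to a 100Gi NFS PV like the one in the question (the claim name is hypothetical; with the operator you would normally let storage.volumeClaimTemplate generate the claim instead of creating one by hand):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data       # hypothetical name
  namespace: monitoring
spec:
  storageClassName: ""        # empty string: bind to a pre-created PV, no dynamic provisioning
  accessModes:
    - ReadWriteOnce           # must be offered by the PV
  resources:
    requests:
      storage: 100Gi          # must not exceed the PV's capacity
  selector:
    matchLabels:
      app: prometheus         # matches the label on the PV from the question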

answered Sep 29 '22 by hk'


To control when old data is removed, use the --storage.tsdb.retention flag,

e.g. --storage.tsdb.retention=7d (by default, Prometheus keeps data for 15 days).
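
Note that with the Prometheus Operator (as used in the question) you don't pass this flag yourself: the operator generates it from the retention field of the Prometheus resource. A sketch of the relevant fields; retentionSize, for a size-based cap, only exists in Prometheus/operator versions that support --storage.tsdb.retention.size:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: monitoring
spec:
  retention: 2w          # time-based retention, turned into the TSDB retention flag by the operator
  retentionSize: 100GB   # optional size cap; needs a version supporting --storage.tsdb.retention.size
  # ...rest of the spec (serviceAccountName, serviceMonitorSelector, storage, etc.)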

To delete data on demand, use the TSDB admin API (it must be enabled with --web.enable-admin-api), e.g. to drop the series matching a selector:

$ curl -X POST -g 'http://<your_host>:9090/api/v1/admin/tsdb/delete_series?match[]=<series_selector>'

EDIT

Kubernetes snippet sample

...
    spec:
      containers:
      - name: prometheus
        image: docker.io/prom/prometheus:v2.0.0
        args:
          - '--config.file=/etc/prometheus/prometheus.yml'
          - '--storage.tsdb.retention=7d'
        ports:
        - name: web
          containerPort: 9090
...

answered Sep 29 '22 by matson kepson