Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How dangerous are high-cardinality labels in Prometheus?

Tags:

prometheus

I'm considering exporting some metrics to Prometheus, and I'm getting nervous about what I'm planning to do.

My system consists of a workflow engine, and I'd like to track some metrics for each step in the workflow. This seems reasonable, with a gauge metric called wfengine_step_duration_seconds. My issue is that there are many thousands of steps across all my workflows.

According to the documentation here, I'm not supposed to programmatically generate any part of the name. That precludes, then, the use of names such as wfengine_step1_duration_seconds and wfengine_step2_duration_seconds, because the step names are programmatic (they change from time to time).

The solution, then is a label for the step names. This also presents a problem, though, because the documentation here and here cautions quite strongly against using labels with high cardinality. Specifically, they recommend keeping "the cardinality of your metrics below 10", and for cardinality over 100, "investigate alternate solutions such as reducing the number of dimensions or moving the analysis away from monitoring".

I'm looking at a number of label values in the low thousands (1,000 to 10,000). Given that the number of metrics otherwise won't be extremely large, is this an appropriate usage of Prometheus, or should I limit myself to more generic metrics, such as a single aggregated step duration instead of individual duration for each step?

like image 740
Mark Avatar asked Sep 22 '17 21:09

Mark


People also ask

What is high cardinality In Prometheus?

The basic definition of cardinality is the number of elements in a given set. In the world of Prometheus and observability, label cardinality is extremely important because it impacts the performance and resource usage of your monitoring system.

What is label cardinality?

Label cardinality is the average number of labels assigned to a document. Source publication. Large-Scale Multi-label Text Classification — Revisiting Neural Networks.

What are high cardinality metrics?

High cardinality refers to a column that can have many possible values. For an online shopping system, fields like userId , shoppingCartId , and orderId are often high-cardinality columns that can take take hundreds of thousands of distinct values. Similarly, requestId might be in the millions.

What is downsampling Prometheus?

One option is downsampling. With standalone Prometheus, this typically means reducing the sampling rate of emitted metrics and thus the fidelity of a set of metrics. With Chronosphere, downsampling means choosing how and how long to store metrics.


1 Answers

The guideline of staying under 100 cardinality for your biggest metrics presumes that you have 1000 replicas of your service, as that's a reasonably safe upper bound. If you know that everyone using this code will always have a lower number of replicas, then there's scope to have a higher cardinality in instrumentation.

Saying that, thousands of labels is still something to be careful with. If it's already tens of thousands, how long before it's hundreds of thousands? Long term you'll likely have to move this data to logs given the cardinality, so you may wish to do so now.

like image 161
brian-brazil Avatar answered Sep 29 '22 22:09

brian-brazil