I have an API that processes collections. Its execution time depends on the collection size (the larger the collection, the longer it takes).
I am researching how to monitor this with Prometheus, but I am unsure whether I am doing things correctly (the documentation is a bit lacking in this area).
The first thing I did was define a Summary metric to measure the execution time of the API. I am using the canonical rate(sum)/rate(count) as explained here.
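For reference, the arithmetic behind rate(sum)/rate(count) can be sketched in plain Java as below. This is only an illustration under assumptions: the class name, method name and sample values are hypothetical and are not part of the Prometheus client, and the real rate() function also deals with counter resets and extrapolation, which this sketch ignores.

```java
// Hypothetical helper, not part of the Prometheus client library.
final class AvgLatencySketch {

    /**
     * Average duration per request between two scrapes of a summary's
     * _sum and _count series: (sum2 - sum1) / (count2 - count1).
     * rate() divides both deltas by the same window length, so it cancels out.
     */
    static double averageSeconds(double sum1, long count1, double sum2, long count2) {
        long requests = count2 - count1;
        if (requests <= 0) {
            // No requests in the window (or the counter was reset).
            return Double.NaN;
        }
        return (sum2 - sum1) / requests;
    }

    public static void main(String[] args) {
        // Assumed sample values: 12.0 s over 40 requests, later 18.0 s over 60 requests.
        System.out.println(averageSeconds(12.0, 40, 18.0, 60));
    }
}
```

The result is the mean time per request in the window, so it says nothing about how large the individual requests were.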
Now, since I know that latency may be affected by the size of the input, I also want to overlay the request size on the average execution time. Since I don't want to track every possible size, I figured I'd use a histogram, like so:
Histogram histogram = Histogram.build()
        .buckets(10, 30, 50)
        .name("BULK_REQUEST_SIZE")
        .help("histogram of bulk sizes to correlate with duration")
        .labelNames("method", "entity")
        .register();
Note: the term 'size' does not relate to the size in bytes but to the length of the collection that needs to be processed. 2 items, 5 items, 50 items...
And in the execution I do (simplified):
@PUT
void process(Collection<Entity> entitiesToProcess, String entityName) {
    // Time the whole bulk operation.
    Summary.Timer t = summary.labels("PUT_BULK", entityName).startTimer();
    // process...
    t.observeDuration();
    // Record how many items this request contained.
    histogram.labels("PUT_BULK", entityName).observe(entitiesToProcess.size());
}
Question:
Your code is correct (though bulk_request_size_bytes would be a better metric name).
The problem is likely that your buckets are suboptimal: 10, 30 and 50 bytes are pretty small for most request sizes. I'd try larger buckets that cover more typical values.
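To illustrate why the bucket boundaries matter, here is a minimal plain-Java sketch of how a Prometheus-style histogram counts observations into cumulative buckets. It is a simplified model rather than the client library's implementation; the class name, method name and sample observations are hypothetical.

```java
import java.util.Arrays;

// Hypothetical model of cumulative histogram buckets, not the Prometheus client itself.
final class BucketSketch {

    /**
     * Returns one cumulative count per upper bound, plus a final +Inf bucket.
     * An observation is counted in every bucket whose upper bound is >= the value.
     */
    static long[] cumulativeCounts(double[] upperBounds, double[] observations) {
        long[] counts = new long[upperBounds.length + 1];
        for (double value : observations) {
            int i = 0;
            // Find the first bucket whose upper bound is >= value (bounds must be sorted).
            while (i < upperBounds.length && value > upperBounds[i]) {
                i++;
            }
            counts[i]++; // index upperBounds.length is the +Inf bucket
        }
        // Turn per-bucket counts into cumulative counts.
        for (int i = 1; i < counts.length; i++) {
            counts[i] += counts[i - 1];
        }
        return counts;
    }

    public static void main(String[] args) {
        double[] bounds = {10, 30, 50};
        double[] observed = {2, 5, 50, 120, 400, 900}; // assumed sample values
        System.out.println(Arrays.toString(cumulativeCounts(bounds, observed)));
    }
}
```

With these assumed values, every observation above 50 is only visible in the +Inf bucket, so the histogram cannot distinguish 120 from 900. Choosing upper bounds that span the values you actually expect avoids that loss of detail.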