Are there any broker-side metrics we can use to monitor a Kafka broker when acknowledgment lag is very high on the producer side?
We are using Datadog to monitor both the producers and the Kafka brokers. The producer ack lag is more than 10 seconds. However, on the broker side, message.in.rate and kafka.net.bytes_in.rate alone do not seem very informative. It would be better to have some lag metrics on the broker side that indicate when the broker is too loaded to acknowledge the producer promptly.
Also, we use kafka.acks = 1, so only the partition leader acknowledges each write.
Does anyone have experience with this? Any advice is welcome. :) Thanks in advance.
I'm guessing you're talking about "metrics" instead of matrix!
On the producer, you have kafka.producer:type=producer-metrics,client-id="{client-id}". That MBean has two interesting attributes:
request-latency-avg: The average request latency in ms
request-latency-max: The maximum request latency in ms
On the broker side, there are a few metrics you want to check to investigate your issue:
Message conversion time: Time in milliseconds spent on message format conversions. kafka.network:type=RequestMetrics,name=MessageConversionsTimeMs,request=Produce
Request total time: Total time Kafka took to process the request. kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce
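If the total time is high, check the metrics that break it down by stage:
Request queue time: Time the request waits in the request queue. kafka.network:type=RequestMetrics,name=RequestQueueTimeMs,request=Produce
Local time: Time the request is processed at the leader. kafka.network:type=RequestMetrics,name=LocalTimeMs,request=Produce
Response queue time: Time the request waits in the response queue. kafka.network:type=RequestMetrics,name=ResponseQueueTimeMs,request=Produce
Response send time: Time to send the response. kafka.network:type=RequestMetrics,name=ResponseSendTimeMs,request=Produce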
These are all listed among the metrics recommended for monitoring in the Kafka documentation: http://kafka.apache.org/documentation/#monitoring
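Once you are collecting the four stage metrics for Produce requests, a small helper can point you at the stage that dominates. This is only a rough sketch: the function names and the sample readings are made up for illustration, and it assumes you have already pulled one value per stage (for example a 99th percentile in milliseconds) out of Datadog or a JMX tool.

```python
from typing import Dict, Tuple

# Stage names taken from the broker metrics listed in this answer.
STAGES = (
    "RequestQueueTimeMs",
    "LocalTimeMs",
    "ResponseQueueTimeMs",
    "ResponseSendTimeMs",
)


def produce_mbean(name: str) -> str:
    """Build the JMX object name of a Produce request metric."""
    return f"kafka.network:type=RequestMetrics,name={name},request=Produce"


def slowest_stage(stage_ms: Dict[str, float]) -> Tuple[str, float]:
    """Return the slowest stage and its share of the summed stage times."""
    total = sum(stage_ms[s] for s in STAGES)
    if total <= 0:
        raise ValueError("no time recorded in any stage")
    worst = max(STAGES, key=lambda s: stage_ms[s])
    return worst, stage_ms[worst] / total


# Hypothetical readings in milliseconds, for illustration only.
sample = {
    "RequestQueueTimeMs": 9200.0,
    "LocalTimeMs": 600.0,
    "ResponseQueueTimeMs": 150.0,
    "ResponseSendTimeMs": 50.0,
}
stage, share = slowest_stage(sample)
print(f"{produce_mbean(stage)} accounts for {share:.0%} of the measured time")
```

With readings like these, almost all of the time is spent waiting in the request queue, which would usually suggest the broker's request handler threads cannot keep up. If LocalTimeMs dominated instead, I would look at how long the leader takes to append to its log. Note that the four stages do not necessarily add up to TotalTimeMs, so treat the share as a hint rather than an exact figure.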