Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does Kafka handle a consumer which is running slower than other consumers?

Let's say I have 20 partitions and five workers. Each partition is assigned a worker. However, one worker is running slower than the other machines. It's still processing (that is, not slow consumer described here), but at 60% rate of the other machines. This could be because the worker is running on a slower VM on AWS EC2, a broken disk or CPU or whatnot. Does Kafka handle rebalancing gracefully somehow to give the slow worker fewer partitions?

like image 884
Ztyx Avatar asked Dec 04 '15 13:12

Ztyx


People also ask

How can Kafka consumer improve performance?

Improving throughput by increasing the minimum amount of data fetched in a request. Use the fetch.max.wait.ms and fetch. min. bytes configuration properties to set thresholds that control the number of requests from your consumer.

What happens if Kafka consumer is down?

If the consumer crashes or is shut down, its partitions will be re-assigned to another member, which will begin consumption from the last committed offset of each partition. If the consumer crashes before any offset has been committed, then the consumer which takes over its partitions will use the reset policy.

How does Kafka handle multiple consumers?

Kafka consumers are typically part of a consumer group . When multiple consumers are subscribed to a topic and belong to the same consumer group, each consumer in the group will receive messages from a different subset of the partitions in the topic.


1 Answers

Kafka doesn't really concern itself with how fast messages are being consumed. It doesn't even get involved with how many consumers there are or how many times each message is read. Kafka just commits messages to partitions and ages them out at the configured time.

It's the responsibility of the group of consumers to make sure that the messages are being read evenly and in a timely fashion. In your case, you have two problems: The reading of one set of partitions lags and then then processing of the messages from those partitions lags.

For the actual consumption of messages from the topic, you'll have to use the Kafka metadata API's to track the relative loads each consumer faces, whether by skewed partitioning or because the consumers are running at different speeds. You either have to re-allocate partitions to consumers to give the slow consumers less work or randomly re-assign consumers to partitions in the hope of eventually evening out the workload over time.

To better balance the processing of messages, you should factor out the reading of the messages from the processing of the messages - something like the Storm streaming model. You still have to programmatically monitor the backlogs into the processing logic, but you'd have the ability to move work to faster nodes in order to balance the work.

like image 80
Chris Gerken Avatar answered Sep 19 '22 17:09

Chris Gerken