Why is Kafka pull-based instead of push-based?

Question

Why is Kafka pull-based instead of push-based? I agree that Kafka gives high throughput, as I have experienced it myself, but I don't see how its throughput would go down if it were push-based. Any ideas on how a push-based design could degrade performance?

kanishka vatsa · Accepted Answer

Scalability is the major driving factor when designing such systems (pull vs. push). Kafka is very scalable: one of its key benefits is that it is easy to add a large number of consumers without affecting performance and without downtime.

Kafka can handle 100k+ events per second coming from producers. Because Kafka consumers pull data from the topic, different consumers can consume messages at their own pace. With a push-based design, the broker would set the delivery rate, and a consumer that processes more slowly than the broker pushes could be overwhelmed. Kafka also supports different consumption models: one consumer can process messages in real time while another processes them in batch mode.
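To make the "different pace" point concrete, here is a minimal sketch in plain Python. It does not use the Kafka client; `Log`, `Consumer`, and `max_records` are hypothetical names invented for illustration, and the list-backed log only stands in for a single topic partition. The idea it shows is that each consumer keeps its own offset and decides how much to fetch per poll, so a slow consumer does not hold back a fast one.

```python
class Log:
    """Toy append-only log standing in for a single topic partition."""

    def __init__(self):
        self.records = []

    def append(self, record):
        self.records.append(record)

    def fetch(self, offset, max_records):
        # The consumer chooses both the starting offset and the batch size.
        return self.records[offset:offset + max_records]


class Consumer:
    """Toy consumer that tracks its own read position in the log."""

    def __init__(self, log, max_records):
        self.log = log
        self.max_records = max_records
        self.offset = 0
        self.seen = []

    def poll(self):
        batch = self.log.fetch(self.offset, self.max_records)
        self.offset += len(batch)
        self.seen.extend(batch)
        return batch


log = Log()
for i in range(10):
    log.append(f"event-{i}")

realtime = Consumer(log, max_records=1)  # takes one record per poll
batch = Consumer(log, max_records=5)     # takes up to five records per poll

for _ in range(3):
    realtime.poll()
    batch.poll()

print(realtime.offset, batch.offset)  # prints "3 10"
```

After three polls each, the batch consumer has read the whole log while the one-at-a-time consumer is at offset 3. Neither affects the other, and the log itself keeps no per-consumer delivery state in this sketch.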

Another reason could be that Kafka was not designed for only a single consumer such as Hadoop. Different consumers can have diverse needs and capabilities.

Pull-based systems have some deficiencies, such as wasting resources by polling regularly even when no data is available. To alleviate this drawback, Kafka supports a 'long polling' mode in which a fetch request waits until real data comes through.
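The long-polling idea can be sketched the same way. This is an illustration in plain Python using `threading.Condition`, not Kafka's actual implementation; `LongPollLog` and `max_wait_s` are hypothetical names. In the real consumer, the comparable knobs are the `fetch.max.wait.ms` and `fetch.min.bytes` settings. The fetch call blocks until a record is available at the requested offset or the wait time runs out, instead of returning empty immediately and forcing the consumer to spin.

```python
import threading
import time


class LongPollLog:
    """Toy log whose fetch blocks until data arrives or a timeout expires."""

    def __init__(self):
        self._records = []
        self._cond = threading.Condition()

    def append(self, record):
        with self._cond:
            self._records.append(record)
            self._cond.notify_all()  # wake any fetch that is waiting

    def fetch(self, offset, max_wait_s):
        with self._cond:
            # Sleep until there is a record at `offset`, or give up after max_wait_s.
            self._cond.wait_for(lambda: len(self._records) > offset,
                                timeout=max_wait_s)
            return self._records[offset:]


log = LongPollLog()

# Nothing has been produced yet, so this fetch times out and returns nothing.
empty = log.fetch(offset=0, max_wait_s=0.05)

# A producer appends a record shortly after the consumer starts waiting.
producer = threading.Timer(0.1, log.append, args=("event-0",))
producer.start()

start = time.monotonic()
got = log.fetch(offset=0, max_wait_s=5.0)  # returns once the record arrives
elapsed = time.monotonic() - start
producer.join()

print(empty, got)  # prints "[] ['event-0']"
```

The second fetch returns as soon as the record is appended rather than after the full five seconds, which is the saving long polling is meant to provide over a tight polling loop.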

Tags:

apache-kafka

user1870400
