I have been using the python-kafka module to consume from a Kafka broker. I want to consume in parallel from a single topic that has 'x' partitions. The documentation has this:
# Use multiple consumers in parallel w/ 0.9 kafka brokers
# typically you would run each on a different server / process / CPU
consumer1 = KafkaConsumer('my-topic',
                          group_id='my-group',
                          bootstrap_servers='my.server.com')
consumer2 = KafkaConsumer('my-topic',
                          group_id='my-group',
                          bootstrap_servers='my.server.com')
Does this mean I can create a separate consumer for each process that I spawn? Also, will there be any overlap between the messages consumed by consumer1 and consumer2?
Thanks
Yes, you can create multiple consumers in multiple threads/processes (and even run them in parallel on different machines). As long as all consumers use the same group.id (the group_id argument in the snippet above), there will be no overlap: Kafka assigns each topic partition to exactly one consumer within a consumer group. Be aware that using more consumers than there are topic partitions will leave the extra consumers idle.
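To make the "one partition, one consumer" rule concrete, here is a minimal toy model in plain Python. It is only a sketch: the function name assign_partitions and the round-robin dealing are assumptions made for illustration, not Kafka's actual assignment strategy, which is negotiated by the group and may split partitions differently. The property it illustrates is the same, though: every partition has a single owner, and surplus consumers get nothing.

```python
def assign_partitions(partitions, consumers):
    """Toy model (hypothetical helper, not part of any Kafka client).

    Deals partitions out to the consumers of one group in round-robin
    order, so each partition ends up with exactly one owner.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# A topic with 3 partitions and a group with 4 consumers.
partitions = [0, 1, 2]
consumers = ["consumer1", "consumer2", "consumer3", "consumer4"]

assignment = assign_partitions(partitions, consumers)
for consumer, owned in assignment.items():
    # A consumer with no partitions has nothing to read and sits idle.
    print(consumer, owned if owned else "idle")
```

With three partitions and four consumers, consumer4 ends up with an empty list, which is the idle case described above. With four partitions and two consumers, each consumer would own two partitions and no partition would be shared.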