I was googling and reading the Kafka documentation, but I couldn't find out the max value of a consumer offset and whether there is offset wraparound after the max value. I understand the offset is a signed Int64 value, so the max value is 0x7FFFFFFFFFFFFFFF. If there is wraparound, how does Kafka handle this situation?
Fundamentally, the only maximum offset imposed by Kafka is that it has to be a 64-bit value. So it could be as large as LONG_MAX.
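To put that limit in perspective, here is a rough back-of-the-envelope sketch. It assumes the offset is a signed 64-bit integer that grows by one per message within a partition; the write rate of one million messages per second and the helper name `years_until_exhausted` are made up purely for illustration.

```python
# Largest value of a signed 64-bit integer (Java's Long.MAX_VALUE).
MAX_OFFSET = 2**63 - 1


def years_until_exhausted(messages_per_second: int) -> float:
    """Years needed to reach MAX_OFFSET in one partition at a constant rate."""
    seconds = MAX_OFFSET / messages_per_second
    return seconds / (365 * 24 * 60 * 60)


print(hex(MAX_OFFSET))  # 0x7fffffffffffffff
# Assumed rate: 1,000,000 messages per second into a single partition.
print(f"{years_until_exhausted(1_000_000):,.0f} years")
```

Even at that assumed rate, a single partition would need on the order of hundreds of thousands of years to reach the maximum.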
The consumer offset is recorded in Kafka, so if the consumer processing a partition for a consumer group goes down, then when it comes back it reads the committed offset and resumes reading messages from the topic where it left off. This avoids consuming the same messages again.
The offset is a simple integer number that is used by Kafka to maintain the current position of a consumer. That's it. The current offset is a pointer to the last record that Kafka has already sent to a consumer in the most recent poll. So, the consumer doesn't get the same record twice because of the current offset.
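The resume-from-offset behaviour described above can be illustrated with a toy in-memory model. This is not the Kafka client API: `ToyPartition`, `ToyConsumer`, and the `committed` dictionary are hypothetical names invented for the sketch, and the model assumes the stored offset is the position of the next record to read.

```python
class ToyPartition:
    """In-memory stand-in for one partition: a list indexed by offset."""

    def __init__(self, records):
        self.records = list(records)

    def read_from(self, offset, max_records):
        return self.records[offset:offset + max_records]


class ToyConsumer:
    """Hypothetical consumer that tracks and commits its position."""

    def __init__(self, partition, committed, group):
        self.partition = partition
        self.committed = committed  # shared store: group name -> next offset
        self.group = group
        # Resume from the committed position, or from 0 if none exists.
        self.position = committed.get(group, 0)

    def poll(self, max_records=2):
        batch = self.partition.read_from(self.position, max_records)
        self.position += len(batch)  # advance past what was just returned
        return batch

    def commit(self):
        self.committed[self.group] = self.position


partition = ToyPartition(["a", "b", "c", "d", "e"])
committed = {}

first = ToyConsumer(partition, committed, "group-1")
seen_before_crash = first.poll()  # ['a', 'b']
first.commit()

# A new consumer in the same group picks up where the first one stopped.
restarted = ToyConsumer(partition, committed, "group-1")
seen_after_restart = restarted.poll()  # ['c', 'd']
```

Because the position is stored per group, a consumer that starts with a different group name would begin again at offset 0 in this model.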
How do you change a consumer offset? Use the kafka-consumer-groups.sh tool to change or reset the offset. You have to specify the topic and the consumer group, and pass the --reset-offsets flag to change the offset.
According to this post, the offset is not reset:
We don't roll back offset at this moment. Since the offset is a long, it can last for a really long time. If you write 1TB a day, you can keep going for about 4 million days.
Plus, you can always use more partitions (each partition has its own offset).
So as Luciano said, probably not worth worrying about.
It seems that this is not really "handled". But, taking into account that the offset is per partition, it seems this is something we should not worry about :)
Please see http://search-hadoop.com/m/uyzND1uRn8D1sSH322/rollover/v=threaded