How exactly do I get an acknowledgement from Kafka once a message is consumed or processed? This might sound stupid, but is there any way to know the start and end offset of the message for which the ack has been received?
When writing data to the Kafka cluster, the producer can choose the level of acknowledgement it requires, i.e. it gets confirmation of its writes according to the following settings: acks=0: the producer sends the data to the broker but does not wait for any acknowledgement.
Once the messages are processed, the consumer sends an acknowledgement to the Kafka broker. When Kafka receives this acknowledgement, it changes the offset to the new value and updates it in ZooKeeper.
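For illustration, here is a minimal sketch of that acknowledgement using the newer Java consumer API (the broker address, group id, and topic are placeholders; the 0.8-era clients discussed below worked differently). The "acknowledgement" is simply an offset commit issued after the records have been processed:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class CommitAfterProcessing {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "example-group");           // hypothetical group id
        props.put("enable.auto.commit", "false");         // commit manually, after processing
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("example-topic")); // hypothetical topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Each record carries its own partition and offset, so you always know
                    // exactly which position is being acknowledged.
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                // The synchronous commit is the consumer-side "acknowledgement": it records
                // the next offset to read for this consumer group.
                consumer.commitSync();
            }
        }
    }
}
```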
acks=1: with a setting of 1, the producer considers the write successful as soon as the leader receives the record. The leader broker responds the moment it receives the record, without waiting for the followers, and the producer waits for that response.
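For reference, a minimal sketch of choosing the producer acknowledgement level with the Java producer (broker address and topic are placeholders, and this uses the newer producer API rather than the 0.8 one):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class AcksExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // acks=0: fire and forget; acks=1: wait for the leader; acks=all: wait for all in-sync replicas.
        props.put("acks", "1");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // get() blocks until the broker's acknowledgement (or a failure) comes back;
            // the returned metadata includes the offset the record was written to.
            RecordMetadata md = producer
                    .send(new ProducerRecord<>("example-topic", "key", "value"))
                    .get();
            System.out.printf("acked at partition=%d offset=%d%n", md.partition(), md.offset());
        }
    }
}
```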
What I have found so far is that in 0.8 they introduced the following way to choose the offset to start reading from:
kafka.api.OffsetRequest.EarliestTime() finds the beginning of the data in the logs and starts streaming from there, while kafka.api.OffsetRequest.LatestTime() only streams new messages.
example code https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
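The offset lookup in that example looks roughly like the sketch below (the 0.8-era SimpleConsumer API, which has since been removed from Kafka; host, port, and topic are placeholders):

```java
import java.util.HashMap;
import java.util.Map;
import kafka.api.PartitionOffsetRequestInfo;
import kafka.common.TopicAndPartition;
import kafka.javaapi.OffsetResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class OffsetLookup {
    // Ask the broker for the earliest (or latest) available offset of a topic partition.
    public static long getOffset(SimpleConsumer consumer, String topic, int partition,
                                 long whichTime, String clientName) {
        TopicAndPartition tp = new TopicAndPartition(topic, partition);
        Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo = new HashMap<>();
        requestInfo.put(tp, new PartitionOffsetRequestInfo(whichTime, 1));
        kafka.javaapi.OffsetRequest request = new kafka.javaapi.OffsetRequest(
                requestInfo, kafka.api.OffsetRequest.CurrentVersion(), clientName);
        OffsetResponse response = consumer.getOffsetsBefore(request);
        if (response.hasError()) {
            throw new RuntimeException("Offset request failed: "
                    + response.errorCode(topic, partition));
        }
        return response.offsets(topic, partition)[0];
    }

    public static void main(String[] args) {
        SimpleConsumer consumer =
                new SimpleConsumer("localhost", 9092, 100000, 64 * 1024, "offsetLookup");
        long earliest = getOffset(consumer, "example-topic", 0,
                kafka.api.OffsetRequest.EarliestTime(), "offsetLookup");
        long latest = getOffset(consumer, "example-topic", 0,
                kafka.api.OffsetRequest.LatestTime(), "offsetLookup");
        System.out.println("earliest=" + earliest + " latest=" + latest);
        consumer.close();
    }
}
```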
I'm still not sure about the acknowledgement part.
Kafka isn't really structured to do this. To understand why, review the Kafka design documentation.
In order to provide an exactly-once acknowledgement, you would need to create some external tracking system for your application, where you explicitly write acknowledgements and implement locks over the transaction IDs to ensure things are only ever processed once. The computational cost of implementing such a system is extraordinarily high, and is one of the main reasons that large transactional systems require comparatively exotic hardware and have arguably lower scalability than systems such as Kafka.
If you do not require strong durability semantics, you can use the groups API to keep rough track of when the last message was read. This ensures that every message is read at least once. Note that since the groups API does not give you the ability to explicitly track your application's own processing logic, your actual processing guarantees are fairly weak in this scenario. Schemes that rely on idempotent processing are common in this environment.
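For example, a hypothetical idempotent handler might key its side effects on the record's topic, partition, and offset, so that a message redelivered under at-least-once semantics is simply skipped:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Hypothetical sketch of idempotent processing: side effects are keyed on the record's
// (topic, partition, offset), so a redelivered message is detected and skipped.
// In a real system the "seen" set would live in a durable store, not in memory.
public class IdempotentHandler {
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    public void handle(ConsumerRecord<String, String> record) {
        String key = record.topic() + "-" + record.partition() + "-" + record.offset();
        if (!seen.add(key)) {
            return; // already processed; at-least-once delivery may replay it
        }
        // ... actual processing goes here ...
    }
}
```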
Alternatively, you may use the poorly-named SimpleConsumer API (it is quite complex to use), which enables you to explicitly track offsets within your application. This is the highest level of processing guarantee that can be achieved through the native Kafka APIs, since it lets you track your application's own processing of the data that is read from the queue.