I have a situation in Kafka where the producer publishes messages at a much higher rate than the consumer can consume them. I need to implement back pressure in Kafka so that consumption and processing can keep up.
Please let me know how I can implement this in Spark and also with the plain Java API.
Kafka acts as the regulator here. You produce into Kafka at whatever rate you want, scaling the brokers out to accommodate the ingest rate. You then consume at your own pace; Kafka persists the data and tracks each consumer's offset as it works its way through what it reads.
You can disable auto-commit by setting enable.auto.commit=false on the consumer and commit only once processing is finished. The consumer will be slower, but Kafka always knows exactly how many messages it has processed. Configure the maximum time allowed between polls with max.poll.interval.ms and the number of records returned by each poll with max.poll.records, and you should be good.
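A minimal sketch of that pattern with the plain Java client, for illustration only; the broker address, group id, topic name, and poll sizes are placeholder assumptions you would tune for your workload:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BackpressureConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "slow-consumer-group");     // assumption
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        // Disable auto-commit so offsets are committed only after processing.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        // Cap how many records each poll() returns (example value).
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");
        // Allow up to 5 minutes between polls before the consumer is
        // considered dead and removed from the group (example value).
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "300000");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // assumption
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // your (possibly slow) processing step
                }
                // Commit only after the whole batch is processed; after a
                // crash, uncommitted records are redelivered rather than lost.
                consumer.commitSync();
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}
```

With this setup the consumer only ever pulls as much as it can handle per poll, which is the back pressure you are after: the unread messages simply wait in Kafka.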