I wish to read only latest msg in spark streaming using kafka, but it also fetches past data
How to set auto.offset.reset in KafkaUtil for spark
JavaPairReceiverInputDStream<String, String> messages =
KafkaUtils.createStream(jssc, args[0], args[1], topicMap);
how to set the conf to fetching only current message . Please give some example.
Thanks in advance, there is also another thread
But not sufficient, pls help me out. Thanks in advance.
You need to use this method from KafkaUtils object:
def createStream[K, V, U <: Decoder[_], T <: Decoder[_]](
jssc: JavaStreamingContext,
keyTypeClass: Class[K],
valueTypeClass: Class[V],
keyDecoderClass: Class[U],
valueDecoderClass: Class[T],
kafkaParams: JMap[String, String],
topics: JMap[String, JInt],
storageLevel: StorageLevel
)
Depending on the Spark version, you cannot use java. There is a bug.
If you are using Spark 1.1.0, you need to add into kafkaParams parameter this property:
"auto.offset.reset", "largest"
Another workaround is generate a groupId prefix randomly, but this is crappy.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With