I use Spark Streaming 2.2.0 with Python and read data from a Kafka (2.11-0.10.0.0) cluster. I submit a Python script with

spark-submit --jars spark-streaming-kafka-0-8-assembly_2.11-2.2.0.jar hodor.py

and Spark reports this error:
17/08/04 10:52:00 ERROR Utils: Uncaught exception in thread stdout writer for python
java.lang.NoSuchMethodError: net.jpountz.util.Utils.checkRange([BII)V
at org.apache.kafka.common.message.KafkaLZ4BlockInputStream.read(KafkaLZ4BlockInputStream.java:176)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at kafka.message.ByteBufferMessageSet$$anonfun$decompress$1.apply$mcI$sp(ByteBufferMessageSet.scala:67)
at kafka.message.ByteBufferMessageSet$$anonfun$decompress$1.apply(ByteBufferMessageSet.scala:67)
at kafka.message.ByteBufferMessageSet$$anonfun$decompress$1.apply(ByteBufferMessageSet.scala:67)
at scala.collection.immutable.Stream$.continually(Stream.scala:1279)
at kafka.message.ByteBufferMessageSet$.decompress(ByteBufferMessageSet.scala:67)
at kafka.message.ByteBufferMessageSet$$anon$1.makeNextOuter(ByteBufferMessageSet.scala:179)
at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:192)
at kafka.message.ByteBufferMessageSet$$anon$1.makeNext(ByteBufferMessageSet.scala:146)
at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
at scala.collection.Iterator$$anon$18.hasNext(Iterator.scala:764)
at org.apache.spark.streaming.kafka.KafkaRDD$KafkaRDDIterator.getNext(KafkaRDD.scala:214)
at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at org.apache.spark.util.NextIterator.foreach(NextIterator.scala:21)
at org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:509)
at org.apache.spark.api.python.PythonRunner$WriterThread$$anonfun$run$3.apply(PythonRDD.scala:333)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1954)
at org.apache.spark.api.python.PythonRunner$WriterThread.run(PythonRDD.scala:269)
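For reference, the script is essentially a minimal direct-stream consumer like the sketch below (the broker and topic names are placeholders, not my real config); the KafkaRDD frames in the trace come from this read path:

# hodor.py (sketch): direct-stream consumer via the kafka-0-8 connector.
# "broker:9092" and "my-topic" are placeholder values.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="hodor")
ssc = StreamingContext(sc, 10)  # 10-second batches

stream = KafkaUtils.createDirectStream(
    ssc, ["my-topic"], {"metadata.broker.list": "broker:9092"})
stream.map(lambda kv: kv[1]).pprint()  # messages arrive as (key, value) pairs

ssc.start()
ssc.awaitTermination()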
I think it is caused by an lz4 version conflict: Spark depends on net.jpountz.lz4 1.3.0, but the Kafka client depends on net.jpountz.lz4 1.2.0, and the Utils.checkRange method the Kafka code calls is no longer present in the 1.3.0 jar that wins on the classpath.
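To confirm which lz4 jar actually wins, you can ask the driver JVM through py4j where it loaded the conflicting class from (a diagnostic sketch; sc is the live SparkContext):

# Print which jar the driver JVM resolved the conflicting class from.
klass = sc._jvm.java.lang.Class.forName("net.jpountz.util.Utils")
location = klass.getProtectionDomain().getCodeSource().getLocation()
print(location.toString())  # e.g. file:/.../lz4-1.3.0.jar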
How can I fix it?
I had the same issue when upgrading to Spark 2.3.0 (joining two streams with Structured Streaming).
This is the error:
java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>
I am using Maven, so the solution was to add the older of the two lz4 versions (Spark 2.3 uses 1.4.0, Kafka uses 1.3.0) to the pom.xml of my main project:
<dependency>
    <groupId>net.jpountz.lz4</groupId>
    <artifactId>lz4</artifactId>
    <version>1.3.0</version>
</dependency>
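Pinning the version in your own pom makes Maven resolve every transitive request for lz4 to that single artifact, so the one jar on the classpath still contains the constructors and methods that both Spark and the Kafka client call. For the spark-submit/Python setup in the question there is no pom to edit, but the same idea can be approximated by prepending a compatible lz4 jar to the driver and executor classpaths. Here is a sketch of that workaround (the path and the choice of lz4-1.2.0.jar, the version the Kafka 0.8 consumer code expects, are assumptions; the jar must exist at that path on every node, and you should verify Spark's own compression paths still work, since Spark 2.2 itself ships against lz4 1.3.0):

spark-submit \
  --jars spark-streaming-kafka-0-8-assembly_2.11-2.2.0.jar \
  --driver-class-path /path/to/lz4-1.2.0.jar \
  --conf spark.executor.extraClassPath=/path/to/lz4-1.2.0.jar \
  hodor.py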