I encounter this problem while running an automated data-processing script in spark-shell. The first few iterations work fine, but sooner or later it always hits this error. I searched for the issue but found no exact match; the similar reports I found are outside the Spark context. I suspect it has something to do with the JVM version, but I cannot figure out how to solve it.
I am using two machines in a Spark standalone cluster.
Machine No.1 Java Information:
java 10.0.2 2018-07-17
Java(TM) SE Runtime Environment 18.3 (build 10.0.2+13)
Java HotSpot(TM) 64-Bit Server VM 18.3 (build 10.0.2+13, mixed mode)
Machine No.2 Java Information:
openjdk 10.0.2 2018-07-17
OpenJDK Runtime Environment (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.4)
OpenJDK 64-Bit Server VM (build 10.0.2+13-Ubuntu-1ubuntu0.18.04.4, mixed mode)
Error Information:
WARN TaskSetManager:66 - Lost task 3.0 in stage 28.0 (TID 1368, 169.254.115.145, executor 1):
java.lang.NoSuchMethodError: sun.nio.ch.DirectBuffer.cleaner()Lsun/misc/Cleaner;
at org.apache.spark.storage.StorageUtils$.cleanDirectBuffer(StorageUtils.scala:212)
at org.apache.spark.storage.StorageUtils$.dispose(StorageUtils.scala:207)
at org.apache.spark.storage.StorageUtils.dispose(StorageUtils.scala)
at org.apache.spark.io.NioBufferedFileInputStream.close(NioBufferedFileInputStream.java:130)
at java.base/java.io.FilterInputStream.close(FilterInputStream.java:180)
at org.apache.spark.io.ReadAheadInputStream.close(ReadAheadInputStream.java:400)
at org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillReader.close(UnsafeSorterSpillReader.java:152)
at org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillReader.loadNext(UnsafeSorterSpillReader.java:124)
at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter$SpillableIterator.loadNext(UnsafeExternalSorter.java:590)
at org.apache.spark.sql.execution.UnsafeKVExternalSorter$KVSorterIterator.next(UnsafeKVExternalSorter.java:287)
at org.apache.spark.sql.execution.aggregate.SortBasedAggregator$$anon$1.findNextSortedGroup(ObjectAggregationIterator.scala:276)
at org.apache.spark.sql.execution.aggregate.SortBasedAggregator$$anon$1.hasNext(ObjectAggregationIterator.scala:247)
at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.hasNext(ObjectAggregationIterator.scala:81)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:409)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:148)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1135)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:844)
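I managed to solve the problem by setting JAVA_HOME for Spark to a Java 8 JDK. The stack trace shows Spark calling sun.nio.ch.DirectBuffer.cleaner() and expecting it to return a sun.misc.Cleaner; on Java 9 and later that class was moved to jdk.internal.ref, so the method with that signature no longer exists and the call fails on Java 10. This is a fairly new issue, but the Spark developers have already spotted it; see https://github.com/apache/spark/pull/22993/files/7f58ae61262d7c2f2d70c24d051c63e8830d5062.
The latest pre-compiled Spark on the official site was released on Nov 2, before this pull request, so it does not include that change. Hopefully a later release will avoid this issue on newer Java versions.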
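As a rough sketch of the JAVA_HOME change described above: one common place to set it is conf/spark-env.sh in the Spark installation directory, which Spark's launch scripts source on startup. The JDK path below is an assumption (a typical Ubuntu location for OpenJDK 8), not something taken from my setup; substitute the path where Java 8 is actually installed. In a standalone cluster this would presumably need to be done on every machine, since the error above was raised on an executor.

```shell
# conf/spark-env.sh (sourced by Spark's launch scripts on this machine)
# ASSUMPTION: hypothetical install path for a Java 8 JDK; adjust as needed.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# After restarting the master and workers, this should report a 1.8.x version:
#   "$JAVA_HOME/bin/java" -version
```

Setting JAVA_HOME only for Spark this way leaves the system default Java untouched, so other applications on the machine can keep using Java 10.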