Is there any technical reason why Spark 2.3 does not work with Java 10 (as of July 2018)?
Here is the output when I run the SparkPi example using spark-submit:
$ ./bin/spark-submit ./examples/src/main/python/pi.py
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.hadoop.security.authentication.util.KerberosUtil to method sun.security.krb5.Config.getInstance()
WARNING: Please consider reporting this to the maintainers of org.apache.hadoop.security.authentication.util.KerberosUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2018-07-13 14:31:30 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-07-13 14:31:31 INFO SparkContext:54 - Running Spark version 2.3.1
2018-07-13 14:31:31 INFO SparkContext:54 - Submitted application: PythonPi
2018-07-13 14:31:31 INFO Utils:54 - Successfully started service 'sparkDriver' on port 58681.
2018-07-13 14:31:31 INFO SparkEnv:54 - Registering MapOutputTracker
2018-07-13 14:31:31 INFO SparkEnv:54 - Registering BlockManagerMaster
2018-07-13 14:31:31 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-07-13 14:31:31 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-07-13 14:31:31 INFO DiskBlockManager:54 - Created local directory at /private/var/folders/mp/9hp4l4md4dqgmgyv7g58gbq0ks62rk/T/blockmgr-d24fab4c-c858-4cd8-9b6a-97b02aa630a5
2018-07-13 14:31:31 INFO MemoryStore:54 - MemoryStore started with capacity 434.4 MB
2018-07-13 14:31:31 INFO SparkEnv:54 - Registering OutputCommitCoordinator
...
2018-07-13 14:31:32 INFO StateStoreCoordinatorRef:54 - Registered StateStoreCoordinator endpoint
Traceback (most recent call last):
File "~/Documents/spark-2.3.1-bin-hadoop2.7/./examples/src/main/python/pi.py", line 44, in <module>
count = spark.sparkContext.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/rdd.py", line 862, in reduce
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/rdd.py", line 834, in collect
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
File "~/Documents/spark-2.3.1-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.lang.IllegalArgumentException
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.xbean.asm5.ClassReader.<init>(Unknown Source)
at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:46)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:449)
at org.apache.spark.util.FieldAccessFinder$$anon$3$$anonfun$visitMethodInsn$2.apply(ClosureCleaner.scala:432)
at scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103)
at scala.collection.mutable.HashMap$$anon$1$$anonfun$foreach$2.apply(HashMap.scala:103)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
at scala.collection.mutable.HashMap$$anon$1.foreach(HashMap.scala:103)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)
at org.apache.spark.util.FieldAccessFinder$$anon$3.visitMethodInsn(ClosureCleaner.scala:432)
at org.apache.xbean.asm5.ClassReader.a(Unknown Source)
at org.apache.xbean.asm5.ClassReader.b(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.xbean.asm5.ClassReader.accept(Unknown Source)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:262)
at org.apache.spark.util.ClosureCleaner$$anonfun$org$apache$spark$util$ClosureCleaner$$clean$14.apply(ClosureCleaner.scala:261)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:261)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:159)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2299)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2073)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:939)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.collect(RDD.scala:938)
at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:162)
at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:844)
2018-07-13 14:31:33 INFO SparkContext:54 - Invoking stop() from shutdown hook
...
I resolved the issue by switching to Java 8 instead of Java 10, as mentioned here. The Spark documentation is explicit about which Java versions are supported:
From the Spark 3.0 docs: "Spark runs on Java 8/11, Scala 2.12, Python 2.7+/3.4+ and R 3.1+. Java 8 prior to version 8u92 support is deprecated as of Spark 3.0."
Warning: Java 9+ is not supported by Spark 2. If you need a newer Java, move to Spark 3.
From later Spark 3.x docs: "Spark runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+."
Apache Spark supports Java 8 and Java 11 (LTS). The next Java LTS, version 17, was released in September 2021; per Spark's release plan, the Spark 3.2 code freeze and release branch cut happened that July.
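If you must stay on Spark 2.3, point it at a Java 8 installation before the driver JVM starts. A minimal sketch, assuming a macOS machine like the one in the trace above and a hypothetical JDK 8 path (find yours with /usr/libexec/java_home -v 1.8):

import os

# Hypothetical JDK 8 location; adjust for your machine
os.environ["JAVA_HOME"] = "/Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home"

from pyspark import SparkContext

sc = SparkContext(appName="PythonPi")  # the gateway JVM now launches with Java 8
# An equivalent map/reduce to the one that crashed above; it works because
# ClosureCleaner can parse Java 8 class files
print(sc.parallelize(range(1, 101)).map(lambda x: x * x).reduce(lambda a, b: a + b))
sc.stop()

When launching through spark-submit instead, the equivalent fix is exporting JAVA_HOME in the shell before running the command.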
Spark has been called a “general purpose distributed data processing engine” ¹ and “a lightning fast unified analytics engine for big data and machine learning” ². It lets you process big data sets faster by splitting the work into chunks and assigning those chunks across computational resources.
When would you not want to use Spark? For multi-user systems with shared memory, Hive may be a better choice ². For real-time, low-latency processing, you may prefer Apache Kafka ⁴. With small data sets, Spark is not going to give you huge gains, so you are probably better off with the typical single-machine libraries and tools.
An IllegalArgumentException is raised when a method receives an argument that is invalid for its contract: for example, a closed file handle passed to a method that is supposed to read from it, or a null string passed to a method that requires a non-empty string. Whenever a caller hands over an invalid argument like this, the Java convention is to throw IllegalArgumentException.
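In Python the closest analogue is ValueError; a tiny illustrative sketch (word_count is a made-up function for this example):

def word_count(text):
    # A Java method with the same contract would throw IllegalArgumentException here
    if not isinstance(text, str) or text == "":
        raise ValueError("text must be a non-empty string")
    return len(text.split())

print(word_count("spark needs java 8"))  # 4
try:
    word_count("")  # invalid argument, rejected just like in the Java convention
except ValueError as e:
    print(e)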
Spark offers high-speed data querying, analysis, and transformation on large data sets. Compared to MapReduce, it does much less reading from and writing to disk, and it runs multi-threaded tasks (from Wikipedia: threads that share the resources of a single core or multiple cores) inside Java Virtual Machine (JVM) processes.
The primary technical reason is that Spark depends heavily on direct access to native memory through sun.misc.Unsafe, an internal API that the Java 9 module system began to encapsulate. More immediately, the IllegalArgumentException in the trace above is thrown by the bundled ASM 5 bytecode reader (org.apache.xbean.asm5.ClassReader): Spark's ClosureCleaner uses it to inspect closure bytecode, and ASM 5 cannot parse the newer class-file format that Java 9+ produces.
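To confirm which Java your driver is actually running, you can ask the gateway JVM through py4j. A debugging sketch; note that sc._jvm is a private PySpark attribute, not a stable API:

from pyspark import SparkContext

sc = SparkContext.getOrCreate()
# Ask the driver JVM (launched by py4j) for its Java version
print(sc._jvm.java.lang.System.getProperty("java.version"))  # e.g. "1.8.0_181" after the fix
sc.stop()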