Full error:
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)[Ljava/lang/Object;
    at org.spark_module.SparkModule$.main(SparkModule.scala:62)
    at org.spark_module.SparkModule.main(SparkModule.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
When I compile and run the code from IntelliJ, it executes fine all the way through. The error only shows up when I submit the .jar as a Spark job (i.e., at runtime).
Line 62 contains: for ((elem, i) <- args.zipWithIndex). I commented out the rest of the code to be sure, and the error kept showing on that line.
At first I thought it was zipWithIndex's fault. Then I changed the line to for (elem <- args) and, guess what, the error still showed. Is the for itself causing this?
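For what it's worth, my understanding is that both forms compile down to the same implicit conversion in Predef, which would explain why dropping zipWithIndex changed nothing. A rough sketch of the desugaring (illustrative only, not the exact bytecode):

object Desugared {
  def main(args: Array[String]): Unit = {
    // `for (elem <- args)` needs collection methods on Array[String], so the
    // compiler inserts Predef's implicit conversion, roughly:
    scala.Predef.refArrayOps(args).foreach(elem => println(elem))

    // The original line goes through the same conversion before zipWithIndex:
    scala.Predef.refArrayOps(args).zipWithIndex.foreach { case (elem, i) => println(s"$i: $elem") }
  }
}

So the NoSuchMethodError on refArrayOps points at a scala-library mismatch rather than anything specific to zipWithIndex.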
Google searches always point to a Scala version incompatibility between the version used to compile and the version used at runtime, but I can't figure out a solution.
I tried this to check the Scala version used by IntelliJ, and here is everything Scala-related under Modules > Scala:

Then I did this to check the runtime version of Scala, and the output is:
(file:/C:/Users/me/.gradle/caches/modules-2/files-2.1/org.scala-lang/scala-library/2.12.11/1a0634714a956c1aae9abefc83acaf6d4eabfa7d/scala-library-2.12.11.jar)
Versions seem to match...
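To be extra sure the runtime side isn't picking up something else (spark-submit puts its own scala-library on the classpath, not the jar from my Gradle cache), I could also print the version from inside the job before any other code runs. A minimal sketch, names are mine:

object RuntimeScalaCheck {
  // Call this first thing in main, before any line that touches Array/collection helpers.
  def printVersion(): Unit = {
    // Scala version of the scala-library actually loaded by spark-submit
    println("Scala runtime: " + scala.util.Properties.versionNumberString)
    // Where that scala-library was loaded from (getCodeSource can be null for bootstrap-loaded classes)
    val src = scala.Predef.getClass.getProtectionDomain.getCodeSource
    if (src != null) println("scala-library location: " + src.getLocation)
  }
}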
This is my build.gradle (it includes the fatJar task):
group 'org.spark_module'
version '1.0-SNAPSHOT'

apply plugin: 'scala'
apply plugin: 'idea'
apply plugin: 'eclipse'

repositories {
    mavenCentral()
}

idea {
    project {
        jdkName = '1.8'
        languageLevel = '1.8'
    }
}

dependencies {
    implementation group: 'org.scala-lang', name: 'scala-library', version: '2.12.11'
    implementation group: 'org.apache.spark', name: 'spark-core_2.12'//, version: '2.4.5'
    implementation group: 'org.apache.spark', name: 'spark-sql_2.12'//, version: '2.4.5'
    implementation group: 'com.datastax.spark', name: 'spark-cassandra-connector_2.12', version: '2.5.0'
    implementation group: 'org.apache.spark', name: 'spark-mllib_2.12', version: '2.4.5'
    implementation group: 'log4j', name: 'log4j', version: '1.2.17'
    implementation group: 'org.scalaj', name: 'scalaj-http_2.12', version: '2.4.2'
}

task fatJar(type: Jar) {
    zip64 true
    from {
        configurations.runtimeClasspath.collect { it.isDirectory() ? it : zipTree(it) }
    } {
        exclude "META-INF/*.SF"
        exclude "META-INF/*.DSA"
        exclude "META-INF/*.RSA"
    }
    manifest {
        attributes 'Main-Class': 'org.spark_module.SparkModule'
    }
    with jar
}

configurations.all {
    resolutionStrategy {
        force 'com.google.guava:guava:12.0.1'
    }
}

compileScala.targetCompatibility = "1.8"
compileScala.sourceCompatibility = "1.8"

jar {
    zip64 true
    getArchiveFileName()
    from {
        configurations.compile.collect {
            it.isDirectory() ? it : zipTree(it)
        }
    }
    manifest {
        attributes 'Main-Class': 'org.spark_module.SparkModule'
    }
    exclude 'META-INF/*.RSA', 'META-INF/*.SF', 'META-INF/*.DSA'
}
To build the (fat) jar:
gradlew fatJar
in IntelliJ's terminal.
To run the job:
spark-submit.cmd .\SparkModule-1.0-SNAPSHOT.jar
in Windows PowerShell.
Thank you
EDIT:
spark-submit.cmd and spark-shell.cmd both show Scala version 2.11.12, so yes, they differ from the one I am using in IntelliJ (2.12.11). The problem is that on Spark's download page there is only one Spark distribution built for Scala 2.12, and it comes without Hadoop; does that mean I have to downgrade from 2.12 to 2.11 in my build.gradle?
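In the meantime, to make the mismatch fail loudly instead of surfacing as a NoSuchMethodError deep inside the job, I could add a guard at the top of main. A hypothetical sketch; "2.12" stands for whatever binary version the jar ends up being compiled against:

object VersionGuard {
  // Assumption: the jar is compiled against Scala 2.12; change this if the build is downgraded to 2.11.
  private val builtFor = "2.12"

  def check(): Unit = {
    val runningOn = scala.util.Properties.versionNumberString // e.g. "2.11.12" under this spark-submit
    require(runningOn.startsWith(builtFor),
      s"Jar built for Scala $builtFor but the runtime provides Scala $runningOn; align the two versions")
  }
}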
I would try spark-submit --version to find out which Scala version Spark is using.
With spark-submit --version I get this information:
[cloudera@quickstart scala-programming-for-data-science]$ spark-submit --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.2.0.cloudera4
      /_/
Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_202
Branch HEAD
Compiled by user jenkins on 2018-09-27T02:42:51Z
Revision 0ef0912caaab3f2636b98371eb29adb42978c595
Url git://github.mtv.cloudera.com/CDH/spark.git
Type --help for more information.
From the spark-shell you could try this to find out the Scala version:
scala> util.Properties.versionString
res3: String = version 2.11.8
The OS could be using a different Scala version; in my case, as you can see, Spark's Scala version and the OS's Scala version are different:
[cloudera@quickstart scala-programming-for-data-science]$ scala -version
Scala code runner version 2.12.8 -- Copyright 2002-2018, LAMP/EPFL and Lightbend, Inc.
Note from O'Reilly's Learning Spark by Holden Karau, Andy Konwinski, Patrick Wendell & Matei Zaharia:
Dependency Conflicts
One occasionally disruptive issue is dealing with dependency conflicts in cases where
a user application and Spark itself both depend on the same library. This comes up
relatively rarely, but when it does, it can be vexing for users. Typically, this will manifest
itself when a NoSuchMethodError, a ClassNotFoundException, or some other
JVM exception related to class loading is thrown during the execution of a Spark job.
There are two solutions to this problem. The first is to modify your application to
depend on the same version of the third-party library that Spark does. The second is
to modify the packaging of your application using a procedure that is often called
“shading.” The Maven build tool supports shading through advanced configuration
of the plug-in shown in Example 7-5 (in fact, the shading capability is why the plugin
is named maven-shade-plugin). Shading allows you to make a second copy of the
conflicting package under a different namespace and rewrites your application’s code
to use the renamed version. This somewhat brute-force technique is quite effective at
resolving runtime dependency conflicts. For specific instructions on how to shade
dependencies, see the documentation for your build tool.
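For Gradle, the Shadow plugin's relocate serves the same purpose as maven-shade-plugin. If you were on sbt instead, a shading rule with sbt-assembly looks roughly like the sketch below; it is illustrative only, and the package pattern is just an example matching the Guava version forced in the build above:

// build.sbt fragment, assuming the sbt-assembly plugin is added to the build
assembly / assemblyShadeRules := Seq(
  // Relocate Guava's packages so the application's copy cannot clash with Spark's
  ShadeRule.rename("com.google.common.**" -> "shaded.com.google.common.@1").inAll
)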