Running a Spark SQL (v2.1.0_2.11) program in Java immediately fails with the following exception, as soon as the first action is called on a dataframe:
java.lang.ClassNotFoundException: org.codehaus.commons.compiler.UncheckedCompileException
I ran it in Eclipse, outside of the spark-submit environment. I use the following Spark SQL Maven dependency:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.1.0</version>
    <scope>provided</scope>
</dependency>
The culprit is the commons-compiler library: conflicting versions of it end up on the classpath.
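To see the conflict in your own build, running mvn dependency:tree on the project should show which versions of commons-compiler are being resolved and where each one comes from.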
To work around this, add the following to your pom.xml:
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
            <version>2.7.8</version>
        </dependency>
    </dependencies>
</dependencyManagement>
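This works because the dependencyManagement entry only pins the version of commons-compiler that spark-sql already pulls in transitively, so a single consistent version should end up on the classpath.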
I had a similar issue when updating from Spark 2.2.1 to Spark 2.3.0. In my case, I had to pin both commons-compiler and janino.
Spark 2.3 solution:
<dependencyManagement>
    <dependencies>
        <!-- Fixes Spark's java.lang.NoClassDefFoundError: org/codehaus/janino/InternalCompilerException -->
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
            <version>3.0.8</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>janino</artifactId>
            <version>3.0.8</version>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.codehaus.janino</groupId>
        <artifactId>commons-compiler</artifactId>
        <version>3.0.8</version>
    </dependency>
    <dependency>
        <groupId>org.codehaus.janino</groupId>
        <artifactId>janino</artifactId>
        <version>3.0.8</version>
    </dependency>
</dependencies>
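Note that janino and commons-compiler are released together, so the two artifacts should always be kept at the same version, matching the janino version your Spark release was built against (3.0.8 here for Spark 2.3).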
If you are using Spark 3.0.1 or higher, select version 3.0.16 for the two janino dependencies in @Maksym's solution, which works very well.
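A minimal sketch of that change, assuming the same dependencyManagement pattern as above with only the version numbers bumped:
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>commons-compiler</artifactId>
            <version>3.0.16</version>
        </dependency>
        <dependency>
            <groupId>org.codehaus.janino</groupId>
            <artifactId>janino</artifactId>
            <version>3.0.16</version>
        </dependency>
    </dependencies>
</dependencyManagement>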