I have a Spark job packaged as an uber-jar using the sbt assembly plugin.
The build.sbt specifies a runnable main class as the entry point of the resulting uber-jar:
mainClass in assembly := Some("com.foo.Bar")
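For reference, a minimal setup along these lines reproduces the situation; the plugin version, project name, and Scala version below are illustrative, and newer sbt releases spell the setting as assembly / mainClass:

// project/plugins.sbt (plugin version is illustrative)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

// build.sbt (name and scalaVersion are illustrative)
name := "assembly-example"
scalaVersion := "2.12.18"
mainClass in assembly := Some("com.foo.Bar")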
After the assembly is created successfully, running the intended command:
java -jar assembly.jar
results in
Error: Could not find or load main class com.foo.Bar
Using an alternative invocation, like java -cp assembly.jar com.foo.Bar, gives the same error message.
Then I extracted the contents of the uber-jar into a new directory, and I can see my com/foo/ directory with the Bar.class file in it.
From the root of the extracted directory I tried:
java -cp . com.foo.Bar
and it runs correctly.
Still trying to find the reason for the error, I ran:
java -verbose -jar assembly.jar
I can see the Java core classes being loaded, but none of my own packaged classes are ever loaded.
What could possibly be wrong here?
After an extensive investigation (read: pulling my hair out), it turns out that this behavior is caused by a rogue INDEX.LIST from one of the flattened jar files landing in the META-INF directory of the resulting uber-jar. Per the JAR file spec, the INDEX.LIST, if present, dictates which packages from the jar file are to be loaded.
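To illustrate (the jar and package names below are hypothetical), an INDEX.LIST has roughly this shape; when one copied verbatim from a dependency survives in the uber-jar, the class loader consults it, finds no entry for com/foo, and never looks at the application's own classes:

JarIndex-Version: 1.0

some-dependency.jar
org/example/dep
org/example/dep/util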
To avoid this, we updated the mergeStrategy with a rule that avoids any pollution of the resulting META-INF directory:
case PathList("META-INF", xs @ _*) => MergeStrategy.discard
This fixed the issue and returned my sanity.
Update:
After some extra searching, it turns out that the default merge strategy already takes proper care of INDEX.LIST. This answer applies when a customized merge strategy contains cases that handle the META-INF path.
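In that case, a narrower alternative to discarding all of META-INF is to single out the index file itself; a sketch, to be combined with whatever custom META-INF cases already exist:

assemblyMergeStrategy in assembly := {
  // Discard only the rogue index file
  case PathList("META-INF", "INDEX.LIST") => MergeStrategy.discard
  // ... existing custom META-INF cases go here ...
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}

With that case listed before the broader META-INF handling, the index file is dropped while the rest of the custom strategy stays intact.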