
Why Maven assembly works when SBT assembly finds conflicts

The title could also be:
What are the differences between the Maven and SBT assembly plugins?

I ran into this issue while migrating a project from Maven to SBT.

To illustrate the problem, I have created an example project whose dependencies behave differently depending on the build tool.

https://github.com/atais/mvn-sbt-assembly


The only dependencies are (sbt style):

libraryDependencies ++= Seq(
  "com.netflix.astyanax" % "astyanax-cassandra" % "3.9.0",
  "org.apache.cassandra" % "cassandra-all" % "3.4"
)

What I do not understand is why mvn package creates the fat jar successfully, while sbt assembly reports conflicts:

[error] 39 errors were encountered during merge
[error] java.lang.RuntimeException: deduplicate: different file contents found in the following:
[error] /home/siatkowskim/.ivy2/cache/org.slf4j/jcl-over-slf4j/jars/jcl-over-slf4j-1.7.7.jar:org/apache/commons/logging/<some classes>
[error] /home/siatkowskim/.ivy2/cache/commons-logging/commons-logging/jars/commons-logging-1.1.1.jar:org/apache/commons/logging/<some classes>
...
[error] /home/siatkowskim/.ivy2/cache/com.github.stephenc.high-scale-lib/high-scale-lib/jars/high-scale-lib-1.1.2.jar:org/cliffc/high_scale_lib/<some classes>
[error] /home/siatkowskim/.ivy2/cache/com.boundary/high-scale-lib/jars/high-scale-lib-1.0.6.jar:org/cliffc/high_scale_lib/<some classes>
...
Atais asked May 09 '18 09:05


2 Answers

It seems that maven-assembly-plugin resolves conflicts roughly the way MergeStrategy.first does (I am not sure the two are completely equivalent): when jar-with-dependencies is used, it just picks one of the conflicting files in an unspecified way, since that descriptor only has one phase:

If two or more elements (e.g., file, fileSet) select different sources for the same file for archiving, only one of the source files will be archived.

As per version 2.5.2 of the assembly plugin, the first phase to add the file to the archive "wins". The filtering is done solely based on name inside the archive, so the same source file can be added under different output names. The order of the phases is as follows: 1) FileItem 2) FileSets 3) ModuleSet 4) DependencySet and 5) Repository elements.

Elements of the same type will be processed in the order they appear in the descriptors. If you need to "overwrite" a file included by a previous set, the only way to do this is to exclude that file from the earlier set.

Note that this behaviour was slightly different in earlier versions of the assembly plugin.

Even if one of the conflicting files would work for all of your dependencies (which isn't necessarily so), Maven doesn't know which one, so you can just silently get the wrong result. Silently at build-time, I mean; at runtime you can get e.g. AbstractMethodError, or again just a wrong result.

You can influence which file gets picked by writing your own descriptor, but it's horribly verbose: there's no equivalent to just writing MergeStrategy.first/last (and concat/discard are not allowed).

The SBT plugin could do the same: default to a strategy when you don't specify one, but then, well, you could silently get the wrong result.

Alexey Romanov answered Oct 09 '22 16:10


This is an extension to Alexey Romanov's answer.

I have also updated my project with a detailed explanation, so you might want to check it out.

Following the advice

You can verify it for this case by unpacking the jar Maven produces and the dependency jars in SBT error message, then checking which .class file Maven used.

I compared the fat jar produced by Maven with the ones produced by sbt using:

  • MergeStrategy.first, which showed some extra files
  • MergeStrategy.last, which showed binary differences and extra files

I then took the next step and checked the fat jars against the dependencies in which sbt found conflicts, specifically:

  • jcl-over-slf4j-1.7.7.jar
  • commons-logging-1.1.1.jar
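A check like this can be done without unpacking anything by hand, by comparing the CRC-32 checksums stored in each archive. The following is only a minimal Scala sketch, not the procedure actually used for the example project: the object name JarCompare, its helper methods and the command-line layout are invented for illustration, and it assumes Scala 2.13 or later (for scala.jdk.CollectionConverters).

```scala
import java.util.zip.ZipFile
import scala.jdk.CollectionConverters._

// Hypothetical helper, not part of Maven, sbt or sbt-assembly.
object JarCompare {

  // Entry name -> CRC-32 of the uncompressed bytes, as recorded in the archive.
  def entryCrcs(jarPath: String): Map[String, Long] = {
    val zip = new ZipFile(jarPath)
    try zip.entries().asScala.filterNot(_.isDirectory).map(e => e.getName -> e.getCrc).toMap
    finally zip.close()
  }

  // Among the entries under `prefix` that exist in both jars, count (identical, different).
  def countMatches(fat: Map[String, Long], dep: Map[String, Long], prefix: String): (Int, Int) = {
    val shared = fat.keySet.intersect(dep.keySet).filter(_.startsWith(prefix))
    val same = shared.count(name => fat(name) == dep(name))
    (same, shared.size - same)
  }

  // Usage: JarCompare <fat-jar> <path-prefix> <dependency-jar>...
  def main(args: Array[String]): Unit = {
    val fat = entryCrcs(args(0))
    val prefix = args(1)
    args.drop(2).foreach { depPath =>
      val (same, different) = countMatches(fat, entryCrcs(depPath), prefix)
      println(s"$depPath: $same identical, $different different")
    }
  }
}
```

Run it with the fat jar, a prefix such as org/apache/commons/logging/ and the two conflicting jars. A dependency whose shared entries are all reported as identical is very likely the one whose classes ended up in the fat jar.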

Conclusion

maven-assembly-plugin resolves conflicts at the jar level. When it finds any conflict, it picks the first jar and simply ignores all the content from the other.

sbt-assembly, by contrast, mixes all the class files and resolves conflicts locally, file by file.
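The difference between the two behaviours described in this conclusion can be shown with a toy model. This is only a sketch of my reading of the results, not code taken from either plugin; the object MergeModel, its two functions and the entry names are invented for illustration.

```scala
// Toy model only: a "jar" is just a map from entry path to (fake) content.
object MergeModel {
  type Jar = Map[String, String]

  // File by file, first wins: every path is kept; on a clash the earliest jar's copy stays.
  def fileLevelFirst(jars: Seq[Jar]): Jar =
    jars.foldLeft(Map.empty[String, String])((kept, jar) => jar ++ kept)

  // Jar level, first wins: a jar that clashes with anything already kept is dropped whole.
  def jarLevelFirst(jars: Seq[Jar]): Jar =
    jars.foldLeft(Map.empty[String, String]) { (kept, jar) =>
      if (jar.keys.exists(kept.contains)) kept else kept ++ jar
    }

  def main(args: Array[String]): Unit = {
    // Made-up entries: both jars provide Log.class, the second also has a class of its own.
    val first: Jar = Map("logging/Log.class" -> "first", "logging/LogFactory.class" -> "first")
    val second: Jar = Map("logging/Log.class" -> "second", "logging/Extra.class" -> "second")
    println(fileLevelFirst(Seq(first, second)).keys.toList.sorted)
    println(jarLevelFirst(Seq(first, second)).keys.toList.sorted)
  }
}
```

In this model the file-level result keeps three entries, including Extra.class from the second jar, while the jar-level result keeps only the two entries of the first jar. Both take Log.class from the first jar.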

My theory is that if your fat jar made with maven-assembly-plugin works, you can specify MergeStrategy.first for all the conflicts in sbt. The only difference would be that the jar produced with sbt will be even bigger, containing extra classes that were ignored by Maven.
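For the two conflicts visible in the error log, this could be written in build.sbt roughly as follows. It is a sketch that assumes a recent sbt-assembly with slash syntax (older versions write assemblyMergeStrategy in assembly instead). The other conflicts, which are elided in the log, would each need a case of their own.

```scala
// build.sbt fragment; assumes the sbt-assembly plugin is already enabled for the project.
assembly / assemblyMergeStrategy := {
  // jcl-over-slf4j vs commons-logging
  case PathList("org", "apache", "commons", "logging", _*) => MergeStrategy.first
  // the two high-scale-lib artifacts
  case PathList("org", "cliffc", "high_scale_lib", _*) => MergeStrategy.first
  // everything else: keep the plugin's default behaviour
  case other =>
    val defaultStrategy = (assembly / assemblyMergeStrategy).value
    defaultStrategy(other)
}
```

Falling back to the default strategy for unmatched paths means any conflict not listed explicitly still fails the build instead of being resolved silently.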

Atais answered Oct 09 '22 15:10