Including a Spark Package JAR file in an SBT-generated fat JAR

The spark-daria project is uploaded to Spark Packages and I'm accessing spark-daria code in another SBT project with the sbt-spark-package plugin.

I can include spark-daria in the fat JAR file generated by sbt assembly with the following code in the build.sbt file.

spDependencies += "mrpowers/spark-daria:0.3.0"   // pull spark-daria from Spark Packages via sbt-spark-package

// keep only spark-daria in the assembly; exclude every other JAR on the classpath
val requiredJars = List("spark-daria-0.3.0.jar")
assemblyExcludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  cp filter { f =>
    !requiredJars.contains(f.data.getName)
  }
}

This code feels like a hack. Is there a better way to include spark-daria in the fat JAR file?

N.B. I want to build a semi-fat JAR file here. I want spark-daria to be included in the JAR file, but I don't want all of Spark in the JAR file!

asked May 17 '17 by Powers

People also ask

How do I run a JAR file in sbt?

A JAR file created by sbt package can be run by the Scala interpreter, but not by the Java interpreter. This is because the class files in that JAR depend on the Scala library classes (scala-library), which sbt package does not include in the JAR it generates.
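For illustration, here is a minimal sketch of what this looks like in practice (the project name, main class, and paths below are hypothetical, not taken from the question):

// build.sbt (minimal sketch)
name := "my-app"
scalaVersion := "2.11.8"
mainClass in (Compile, packageBin) := Some("com.example.Main")

// After `sbt package`:
//   java -jar target/scala-2.11/my-app_2.11-0.1.0-SNAPSHOT.jar
//     fails with NoClassDefFoundError for scala/* classes, because scala-library is not in the JAR
//   scala target/scala-2.11/my-app_2.11-0.1.0-SNAPSHOT.jar
//     works, because the scala runner puts scala-library on the classpath for you
//   java -cp /path/to/scala-library.jar:target/scala-2.11/my-app_2.11-0.1.0-SNAPSHOT.jar com.example.Main
//     also works once scala-library is added to the classpath by hand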

What does sbt package do?

By default, sbt constructs a manifest for the binary package from settings such as organization and mainClass. Additional attributes may be added to the packageOptions setting, scoped by the configuration and package task. Main attributes may be added with Package.ManifestAttributes.
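As a sketch, an extra manifest attribute can be added like this (the attribute name and value are purely illustrative):

// build.sbt: add a custom attribute to the manifest written by sbt package
packageOptions in (Compile, packageBin) +=
  Package.ManifestAttributes("Built-By" -> "mrpowers")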


1 Answer

The sbt-spark-package README for version 0.2.6 states the following:

In any case where you really can't specify Spark dependencies using sparkComponents (e.g. you have exclusion rules) and configure them as provided (e.g. standalone jar for a demo), you may use spIgnoreProvided := true to properly use the assembly plugin.

You should then use this flag in your build definition and mark your Spark dependencies as provided, as I do with spark-sql 2.2.0 in the following example:

libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided"

Please note that by setting this, your IDE may no longer have the dependency references it needs to compile and run your code locally, so you may have to add the necessary JARs to the classpath by hand. I do this often in IntelliJ: I keep a Spark distribution on my machine and add its jars directory to the IntelliJ project definition (this question may help you with that, should you need it).
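Putting the whole answer together, a build.sbt for the setup in the question might look roughly like the sketch below. The spark-daria and Spark versions come from the question and the answer; the package name is hypothetical, and the setting names (spName, sparkVersion, spIgnoreProvided) are the ones documented by sbt-spark-package:

// build.sbt (sketch)
spName := "mrpowers/my-app"          // Spark Packages name, in organization/name form
scalaVersion := "2.11.8"
sparkVersion := "2.2.0"              // used by the sbt-spark-package plugin
spIgnoreProvided := true             // per the README quote above: needed so the assembly plugin works with hand-declared provided deps

// spark-daria comes from Spark Packages and ends up in the fat JAR
spDependencies += "mrpowers/spark-daria:0.3.0"

// Spark itself is marked provided, so it stays out of the fat JAR
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided"

With this in place, sbt assembly should produce the semi-fat JAR the question asks for: spark-daria is included, Spark is not, and the assemblyExcludedJars filter is no longer needed.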

answered by stefanobaghino