UnsatisfiedLinkError: no snappyjava in java.library.path when running a Spark MLlib unit test within IntelliJ

The following exception occurs when running a Spark unit test that requires Snappy compression:

java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
    at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
    at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
    at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:150)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:68)
    at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:60)
    at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
    at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:79)
    at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
    at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
    at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1077)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:849)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:790)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:793)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:792)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:792)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:793)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$submitStage$4.apply(DAGScheduler.scala:792)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:792)
    at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:774)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1385)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path
    at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1878)
    at java.lang.Runtime.loadLibrary0(Runtime.java:849)
    at java.lang.System.loadLibrary(System.java:1087)
    at org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
    ... 33 more

What settings or changes are required to fix the issue?

Asked by WestCoastProjects on May 04 '15

4 Answers

Another solution is to upgrade your version of snappy-java: the problem exists in 1.0.4.1 and was fixed in 1.0.5. Adding an exclusion to the Spark dependency, like this:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
    <exclusions>
        <exclusion>
           <groupId>org.xerial.snappy</groupId>
           <artifactId>snappy-java</artifactId>
        </exclusion>
    </exclusions>
</dependency>

and then adding

<dependency>
    <groupId>org.xerial.snappy</groupId>
    <artifactId>snappy-java</artifactId>
    <version>1.0.5</version>
</dependency>

did it for me.
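To confirm which snappy-java version actually lands on the test classpath after these changes, the Maven dependency plugin gives a quick check (run from the project root):

# Show only the snappy-java entries in the resolved dependency tree:
mvn dependency:tree -Dincludes=org.xerial.snappy

The output should list snappy-java:1.0.5 as a direct dependency instead of the 1.0.4.1 pulled in transitively through spark-core.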

Answered by Sean

The way to handle this is to update the IntelliJ run configuration: add the following to the JVM parameters (VM options):

-Dorg.xerial.snappy.lib.name=libsnappyjava.jnilib -Dorg.xerial.snappy.tempdir=/tmp 
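These are standard snappy-java system properties (snappy-java reads org.xerial.snappy.lib.name and org.xerial.snappy.tempdir at load time), so the same flags work outside IntelliJ as well. A sketch of the equivalent plain java invocation, where the classpath and runner class are placeholders to adjust for your project:

# The same properties passed on the command line; <test-classpath> and the
# runner class below are illustrative placeholders, not real names:
java -Dorg.xerial.snappy.lib.name=libsnappyjava.jnilib \
     -Dorg.xerial.snappy.tempdir=/tmp \
     -cp <test-classpath> com.example.YourTestRunner

In IntelliJ itself, the flags go into Run > Edit Configurations > VM options for the test configuration.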
Answered by WestCoastProjects

I faced this problem with a clean standalone installation of Spark 1.6.1. To solve it I had to do two things (a command-line sketch follows the list):

1) manually copy libsnappyjava.jnilib (it is bundled inside the snappy-java jar) to a directory on java.library.path (which includes multiple locations; ~/Library/Java/Extensions/ works)

2) add snappy-java-1.1.2.4.jar to Spark's classpath (in spark-env.sh, add "export SPARK_CLASSPATH=.../snappy-java-1.1.2.4.jar")
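A sketch of both steps as shell commands; the jar location and the path of the native library inside the jar are assumptions that may vary with the snappy-java version and platform:

# Step 1: extract the OS X native library bundled in the snappy-java jar into
# one of the java.library.path locations (internal jar path assumed for
# snappy-java 1.1.2.4 on Mac/x86_64):
unzip -j snappy-java-1.1.2.4.jar \
    org/xerial/snappy/native/Mac/x86_64/libsnappyjava.jnilib \
    -d ~/Library/Java/Extensions/

# Step 2: put the jar itself on Spark's classpath via conf/spark-env.sh
# (/path/to is a placeholder):
export SPARK_CLASSPATH=/path/to/snappy-java-1.1.2.4.jar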

Answered by Maxim Konovalov

I experienced the same error. The version of spark-core was 1.3.0-cdh5.4.3; once I changed it to 1.3.0, the error went away.

Note that the dependency is marked "provided", so it doesn't matter in production; it only affects the development machine.

Edit: I found a more reasonable solution. The problem results from a bug in the Snappy compression support for Java on OS X, so to resolve it you can add the following to your pom file:

<dependency>
    <groupId>org.xerial.snappy</groupId>
    <artifactId>snappy-java</artifactId>
    <version>1.1.2</version>
    <type>jar</type>
    <scope>provided</scope>
</dependency>
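Before rerunning the whole suite, a quick way to check that the native library now loads is to force snappy-java's loader directly (this assumes the Scala launcher is installed and the jar path is adjusted to wherever Maven cached it):

# Touching the Snappy class runs SnappyLoader, so this fails fast with the
# same UnsatisfiedLinkError if the native library still cannot be found
# (the jar path is a placeholder):
scala -cp snappy-java-1.1.2.jar \
    -e 'println(org.xerial.snappy.Snappy.getNativeLibraryVersion)'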
Answered by Liran Brimer