I have built a JAR file from my Spark app with Maven (`mvn clean compile assembly:single`) and the following POM file:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>mgm.tp.bigdata</groupId>
  <artifactId>ma-spark</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>ma-spark</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <repositories>
    <repository>
      <id>cloudera</id>
      <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
  </repositories>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.1.0-cdh5.2.5</version>
    </dependency>
    <dependency>
      <groupId>mgm.tp.bigdata</groupId>
      <artifactId>ma-commons</artifactId>
      <version>0.0.1-SNAPSHOT</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass>mgm.tp.bigdata.ma_spark.SparkMain</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>
If I run my app with `java -jar ma-spark-0.0.1-SNAPSHOT-jar-with-dependencies.jar` in the terminal, I get the following error message:
VirtualBox:~/Schreibtisch$ java -jar ma-spark-0.0.1-SNAPSHOT-jar-with-dependencies.jar
2015-Jun-02 12:53:36,348 [main] org.apache.spark.util.Utils
WARN - Your hostname, proewer-VirtualBox resolves to a loopback address: 127.0.1.1; using 10.0.2.15 instead (on interface eth0)
2015-Jun-02 12:53:36,350 [main] org.apache.spark.util.Utils
WARN - Set SPARK_LOCAL_IP if you need to bind to another address
2015-Jun-02 12:53:36,401 [main] org.apache.spark.SecurityManager
INFO - Changing view acls to: proewer
2015-Jun-02 12:53:36,402 [main] org.apache.spark.SecurityManager
INFO - Changing modify acls to: proewer
2015-Jun-02 12:53:36,403 [main] org.apache.spark.SecurityManager
INFO - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(proewer); users with modify permissions: Set(proewer)
Exception in thread "main" com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'akka.version'
at com.typesafe.config.impl.SimpleConfig.findKey(SimpleConfig.java:115)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:136)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:142)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:150)
at com.typesafe.config.impl.SimpleConfig.find(SimpleConfig.java:155)
at com.typesafe.config.impl.SimpleConfig.getString(SimpleConfig.java:197)
at akka.actor.ActorSystem$Settings.<init>(ActorSystem.scala:136)
at akka.actor.ActorSystemImpl.<init>(ActorSystem.scala:470)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1454)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1450)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:156)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:203)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:53)
at mgm.tp.bigdata.ma_spark.SparkMain.main(SparkMain.java:38)
What am I doing wrong?
Best regards, Paul
This is what you are doing wrong:
i run my app with java -jar ma-spark-0.0.1-SNAPSHOT-jar-with-dependencies.jar
Once you have your application built, you should launch it using the spark-submit script. This script takes care of setting up the classpath with Spark and its dependencies, and it supports the different cluster managers and deploy modes that Spark offers:
./bin/spark-submit \
--class <main-class> \
--master <master-url> \
--deploy-mode <deploy-mode> \
--conf <key>=<value> \
... # other options
<application-jar> \
[application-arguments]
I strongly advise you to read the official documentation about Submitting Applications.
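For example, with the JAR from the question, a local test run could look like this (the local[2] master URL and the path to spark-submit are assumptions; adjust them for your installation and cluster):
./bin/spark-submit \
--class mgm.tp.bigdata.ma_spark.SparkMain \
--master local[2] \
ma-spark-0.0.1-SNAPSHOT-jar-with-dependencies.jar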
This is most likely because the Akka config file (reference.conf) from the Akka JAR got overwritten or dropped while packaging the fat JAR.
You can try another plugin, maven-shade-plugin. In the pom.xml you need to specify how to resolve conflicts between resources with the same name. Below is an example:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <minimizeJar>false</minimizeJar>
        <createDependencyReducedPom>false</createDependencyReducedPom>
        <artifactSet>
          <includes>
            <!-- Include here the dependencies you want to be packed in your fat jar -->
            <include>my.package.etc....:*</include>
          </includes>
        </artifactSet>
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
            </excludes>
          </filter>
        </filters>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
            <resource>reference.conf</resource>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
Please note the <transformers> section, which instructs the shade plugin to append the contents of same-named resource files instead of replacing them.
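Since the shade goal is bound to the package phase in the snippet above, a regular build then produces the shaded fat JAR; no separate assembly goal is needed:
mvn clean package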
This worked for me.
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>1.5</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <shadedClassifierName>allinone</shadedClassifierName>
        <artifactSet>
          <includes>
            <include>*:*</include>
          </includes>
        </artifactSet>
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
            </excludes>
          </filter>
        </filters>
        <transformers>
          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
            <resource>reference.conf</resource>
          </transformer>
          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
            <resource>META-INF/spring.handlers</resource>
          </transformer>
          <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
            <resource>META-INF/spring.schemas</resource>
          </transformer>
          <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <manifestEntries>
              <Main-Class>com.echoed.chamber.Main</Main-Class>
            </manifestEntries>
          </transformer>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>
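Because <shadedArtifactAttached> is true and the classifier is "allinone", mvn package attaches the shaded JAR under that classifier. Applied to the POM from the question, the resulting file name would presumably be the following (derived from the question's artifactId and version, so treat it as an assumption):
java -jar target/ma-spark-0.0.1-SNAPSHOT-allinone.jar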
The ConfigException$Missing error indicates that the Akka config file, i.e. the reference.conf file, is not bundled in the application JAR. The reason could be that when multiple files with the same name exist in different dependency JARs, the default strategy checks whether they are all identical; if not, it omits that file.
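A quick sanity check is to inspect the fat JAR directly (JAR name taken from the question; the merged file should contain the akka.version key the exception complains about):
jar tf ma-spark-0.0.1-SNAPSHOT-jar-with-dependencies.jar | grep reference.conf
unzip -p ma-spark-0.0.1-SNAPSHOT-jar-with-dependencies.jar reference.conf | grep version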
I had the same issue and I resolved it as follows:
Generate a merged reference.conf using the AppendingTransformer: by a merged reference.conf file I mean that all dependent modules containing a resource named reference.conf (such as akka-core, akka-http, akka-remoting, etc.) are appended together by the AppendingTransformer. We add the AppendingTransformer to the pom file as follows:
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
  <resource>reference.conf</resource>
</transformer>
`mvn clean install` will now generate the fat JAR with the merged reference.conf file.
Still the same error: `spark-submit <main-class> <app.jar>` still gave the same error when I deployed my Spark app on EMR.
Reason: since HDFS is the configured filesystem, Spark jobs on an EMR cluster read from HDFS by default. So the file you want to use must already exist in HDFS. I added the reference.conf file to HDFS using the following approach:
1. Extract the reference.conf file from app.jar into the /tmp folder:
`cd /tmp`
`jar xvf path_to_application.jar reference.conf`
2. Copy the extracted reference.conf from the local path (in this case /tmp) to an HDFS path (e.g. /user/hadoop):
`hdfs dfs -put /tmp/reference.conf /user/hadoop`
3. Load the config as follows:
`val parsedConfig = ConfigFactory.parseFile(new File("/user/hadoop/reference.conf"))`
`val config = ConfigFactory.load(parsedConfig)`
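For a Java application like the one in the question, the equivalent Typesafe Config calls would look roughly like this (a sketch under the same path assumption as above):
`import java.io.File;`
`import com.typesafe.config.Config;`
`import com.typesafe.config.ConfigFactory;`
`// Parse the extracted reference.conf, then resolve it via load()`
`Config parsedConfig = ConfigFactory.parseFile(new File("/user/hadoop/reference.conf"));`
`Config config = ConfigFactory.load(parsedConfig);`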
Alternate solution:
`ConfigFactory.parseFile(new File("/tmp/reference.conf"))`
will now read reference.conf from the local file system. Hope that helps and saves some debugging time for you!