I've created an Apache Spark application in Java. All it does is count the lines containing the word "spark", 1000 times over.
Here's my code:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class Example1 {
    public static void main(String[] args) {
        String logfile = args[0];
        try {
            SparkConf conf = new SparkConf();
            conf.setAppName("Sample");
            conf.setMaster("spark://<master>:7077");
            conf.set("spark.executor.memory", "1g");
            JavaSparkContext sc = new JavaSparkContext(conf);
            JavaRDD<String> logData = sc.textFile(logfile).cache();

            long count = 0;
            for (int i = 0; i < 1000; i++) {
                count += logData.filter(new Function<String, Boolean>() {
                    public Boolean call(String s) {
                        return s.toLowerCase().contains("spark");
                    }
                }).count();
            }
        } catch (Exception ex) {
            System.out.println(ex.getMessage());
        }
    }
}
When I debug this in the Eclipse IDE, I encounter a java.lang.ClassNotFoundException:
WARN scheduler.TaskSetManager: Loss was due to java.lang.ClassNotFoundException
java.lang.ClassNotFoundException: org.spark.java.examples.Example1$1
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
I also tried deploying this to the cluster using spark-submit, but the same exception was thrown. Here's a portion of the stack trace:
ERROR Executor: Exception in task ID 4
java.lang.ClassNotFoundException: org.spark.java.examples.Example1$1
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
Any ideas on how to resolve this? Thanks in advance!
You need to ship the jar containing your job to the workers. To do that, have Maven build a jar and add that jar to the context:
conf.setJars(new String[]{"path/to/jar/Sample.jar"}); [*]
For a "real" job you would need to build a jar with dependencies (see the Maven Shade plugin), but for a simple job with no external dependencies a plain jar is sufficient.
[*] I'm not very familiar with the Spark Java API; I'm just assuming it should be something like this.
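To show where that call would sit in the asker's code, here is a minimal sketch of the driver setup, assuming the built jar lands at target/Sample.jar (a placeholder path) and using SparkConf.setJars, which takes an array of jar paths:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class Example1 {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("Sample")
                .setMaster("spark://<master>:7077")
                // Ship the application jar to the executors so that classes
                // such as the anonymous Example1$1 can be loaded there.
                // The path below is a placeholder for wherever your build writes the jar.
                .setJars(new String[]{"target/Sample.jar"});

        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... build the RDD and run the filter/count loop as in the question ...
        sc.stop();
    }
}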
You must include your jar in the workers' classpath. You can do this in two ways; the first one is the recommended method.
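As a rough illustration (not necessarily the exact method this answer has in mind), one programmatic way to put a jar on the workers' classpath is JavaSparkContext.addJar; the jar path below is a placeholder:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class AddJarExample {  // hypothetical class name, for illustration only
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("Sample")
                .setMaster("spark://<master>:7077");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // Distribute the application jar to every executor at runtime.
        sc.addJar("target/Sample.jar");
        // ... transformations and actions that use your classes run here ...
        sc.stop();
    }
}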
This can also happen if you do not specify the full package name for your class on the spark-submit command line. If the main method for the application is in test.spark.SimpleApp, then the command line needs to look something like this:
./bin/spark-submit --class "test.spark.SimpleApp" --master local[2] /path_to_project/target/spark_testing-1.0-SNAPSHOT.jar
Passing just --class "SimpleApp" will fail with a ClassNotFoundException.
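To make the relationship concrete, the fully qualified name passed to --class must match the package declaration in the source file. A minimal sketch of the corresponding class (body abbreviated) might look like this:

package test.spark;  // this package plus the class name give "test.spark.SimpleApp"

public class SimpleApp {
    public static void main(String[] args) {
        // ... application logic ...
    }
}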