I'm having problems with a ClassNotFoundException using this simple example:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import java.net.URLClassLoader
import scala.util.Marshal

class ClassToRoundTrip(val id: Int) extends scala.Serializable {
}

object RoundTripTester {

  def test(id: Int): ClassToRoundTrip = {

    // Get the current classpath and output. Can we see simpleapp jar?
    val cl = ClassLoader.getSystemClassLoader
    val urls = cl.asInstanceOf[URLClassLoader].getURLs
    urls.foreach(url => println("Executor classpath is:" + url.getFile))

    // Simply instantiating an instance of object and using it works fine.
    val testObj = new ClassToRoundTrip(id)
    println("testObj.id: " + testObj.id)

    val testObjBytes = Marshal.dump(testObj)
    val testObjRoundTrip = Marshal.load[ClassToRoundTrip](testObjBytes)  // <<-- ClassNotFoundException here
    testObjRoundTrip
  }
}

object SimpleApp {
  def main(args: Array[String]) {

    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)

    val cl = ClassLoader.getSystemClassLoader
    val urls = cl.asInstanceOf[URLClassLoader].getURLs
    urls.foreach(url => println("Driver classpath is: " + url.getFile))

    val data = Array(1, 2, 3, 4, 5)
    val distData = sc.parallelize(data)
    distData.foreach(x => RoundTripTester.test(x))
  }
}
In local mode, submitting as per the docs generates a ClassNotFoundException at the Marshal.load call (marked in the code above), where the ClassToRoundTrip object is deserialized. Strangely, the direct instantiation a few lines earlier is okay:
spark-submit --class "SimpleApp" \
             --master local[4] \
             target/scala-2.10/simpleapp_2.10-1.0.jar
However, if I add the extra parameters --driver-class-path and --jars, it works fine in local mode:
spark-submit --class "SimpleApp" \
             --master local[4] \
             --driver-class-path /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/simpleapp_2.10-1.0.jar \
             --jars /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/SimpleApp.jar \
             target/scala-2.10/simpleapp_2.10-1.0.jar
However, submitting to a local dev master still generates the same issue:
spark-submit --class "SimpleApp" \
             --master spark://localhost.localdomain:7077 \
             --driver-class-path /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/simpleapp_2.10-1.0.jar \
             --jars /home/xxxxxxx/workspace/SimpleApp/target/scala-2.10/simpleapp_2.10-1.0.jar \
             target/scala-2.10/simpleapp_2.10-1.0.jar
I can see from the output that the JAR file is being fetched by the executor.
Logs for one of the executors are here:
stdout: http://pastebin.com/raw.php?i=DQvvGhKm
stderr: http://pastebin.com/raw.php?i=MPZZVa0Q
I'm using Spark 1.0.2. The ClassToRoundTrip class is included in the JAR. I would rather not have to hardcode values in SPARK_CLASSPATH or SparkContext.addJar. Can anyone help?
I had this same issue. If the master is local, the program runs fine for most people, but if it is set to something like "spark://myurl:7077" (as happened to me), it doesn't work. Most people hit the error because an anonymous class is not found during execution. It is resolved by calling SparkContext.addJar("path/to/jar").
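A minimal sketch of that fix, assuming a hypothetical JAR path (SparkContext.addJar takes a single path string in Spark 1.x):

import org.apache.spark.{SparkConf, SparkContext}

object AddJarExample {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    // Ship the application JAR to the executors at runtime so that classes
    // like ClassToRoundTrip can be loaded when tasks deserialize objects.
    // The path below is a placeholder; point it at your own built JAR.
    sc.addJar("/path/to/simpleapp_2.10-1.0.jar")
    sc.parallelize(1 to 5).foreach(x => println("processed " + x))
    sc.stop()
  }
}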
Make sure you are doing the following things:

1. Set the JAR in code when constructing the SparkContext (via SparkConf.setJars or SparkContext.addJar).
2. Pass the same JAR to spark-submit with --jars pathToYourJar/target/yourJarFromMaven.jar.

Note: the JAR pathToYourJar/target/yourJarFromMaven.jar in the last point is the same one set in code in the first point of this answer.
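For illustration, a minimal sketch of the first point, reusing the hypothetical Maven path from the note above (SparkConf.setJars takes a Seq of JAR paths):

import org.apache.spark.{SparkConf, SparkContext}

object MatchingJarsExample {
  def main(args: Array[String]) {
    val conf = new SparkConf()
      .setAppName("Simple Application")
      // Same JAR that is passed to spark-submit via --jars (placeholder path).
      .setJars(Seq("pathToYourJar/target/yourJarFromMaven.jar"))
    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}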
I also had the same issue. I think --jars was not shipping the JARs to the executors. After I added them to the SparkConf, it worked fine:
val conf = new SparkConf().setMaster("...").setJars(Seq("/a/b/x.jar", "/c/d/y.jar"))
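As I understand the Spark 1.x behavior, JARs listed via setJars (like those added with SparkContext.addJar) are served from the driver and fetched by each executor when it starts, which is what puts classes such as ClassToRoundTrip on the executor classpath before tasks deserialize them.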
This troubleshooting web page is useful too.