
Failed to load class for data source: com.databricks.spark.csv

Tags:

apache-spark

My build.sbt file has this:

scalaVersion := "2.10.3"
libraryDependencies += "com.databricks" % "spark-csv_2.10" % "1.1.0"

I am running Spark in standalone cluster mode, and my SparkConf is SparkConf().setMaster("spark://ec2-[ip].compute-1.amazonaws.com:7077").setAppName("Simple Application"). I am not calling setJars, and I am not sure whether I need to.

I package the jar with sbt package and run the application with ./bin/spark-submit --master spark://ec2-[ip].compute-1.amazonaws.com:7077 --class "[classname]" target/scala-2.10/[jarname]_2.10-1.0.jar.

On running this, I get this error:

java.lang.RuntimeException: Failed to load class for data source: com.databricks.spark.csv

What's the issue?

kamalbanga asked Jul 23 '15 19:07


2 Answers

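Declare spark-csv as a dependency alongside Spark core and Spark SQL, and make sure all three artifacts are built for the same Scala version (the _2.10 suffix). For example, with Maven: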

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.6.1</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.6.1</version>
</dependency>

<dependency>
    <groupId>com.databricks</groupId>
    <artifactId>spark-csv_2.10</artifactId>
    <version>1.4.0</version>
</dependency>
Thilina Piyadasun answered Oct 20 '22 00:10


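Pass the option --packages com.databricks:spark-csv_2.10:1.2.0 to spark-submit, placing it after --class and before the target/ jar path. spark-submit treats everything after the application jar as arguments to your application, so an option placed after the jar is not seen by spark-submit.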
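
To make the ordering concrete, here is a minimal sketch that assembles the full command from the question with the --packages option in place. It only prints the command (a dry run), so it can be tried without a cluster. The [ip], [classname] and [jarname] placeholders are copied as-is from the question, and build_submit_cmd is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# Placeholders copied verbatim from the question; substitute real values before use.
MASTER='spark://ec2-[ip].compute-1.amazonaws.com:7077'
MAIN_CLASS='[classname]'
APP_JAR='target/scala-2.10/[jarname]_2.10-1.0.jar'
# Maven coordinates from this answer, in groupId:artifactId:version form.
CSV_PACKAGE='com.databricks:spark-csv_2.10:1.2.0'

# build_submit_cmd is a hypothetical helper: it only prints the command line.
# Every option precedes the application jar, because spark-submit hands
# anything after the jar to the application as its own arguments.
build_submit_cmd() {
  printf '%s\n' "./bin/spark-submit --master $MASTER --class \"$MAIN_CLASS\" --packages $CSV_PACKAGE $APP_JAR"
}

# Dry run: show the command instead of executing it.
build_submit_cmd
```

Once the printed command looks right, run it from the Spark installation directory. Note that --packages downloads the artifact and its dependencies by their Maven coordinates, so the machine running spark-submit needs network access to a repository that hosts them.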

claudiaann1 answered Oct 19 '22 23:10