
Failed to load class for data source: com.databricks.spark.csv



My build.sbt file has this:

scalaVersion := "2.10.3"
libraryDependencies += "com.databricks" % "spark-csv_2.10" % "1.1.0"

I am running Spark in standalone cluster mode, and my SparkConf is SparkConf().setMaster("spark://ec2-[ip].compute-1.amazonaws.com:7077").setAppName("Simple Application"). I am not calling setJars and am not sure whether I need to.

I package the jar with sbt package. The command I use to run the application is ./bin/spark-submit --master spark://ec2-[ip].compute-1.amazonaws.com:7077 --class "[classname]" target/scala-2.10/[jarname]_2.10-1.0.jar.

On running this, I get this error:

java.lang.RuntimeException: Failed to load class for data source: com.databricks.spark.csv

What's the issue?

kamalbanga asked Jul 23 '15 19:07


2 Answers

Use the spark-csv dependency that matches your Scala and Spark versions. For example:



Thilina Piyadasun answered Oct 20 '22 00:10

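Include the option --packages com.databricks:spark-csv_2.10:1.2.0 in the spark-submit command. Put it after --class and before the target/ jar path: spark-submit passes everything after the application jar to your application as arguments, so an option placed after the jar is not read by spark-submit.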
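As a rough sketch, the command from the question with the option added could look like the following. The variable names are made up for illustration, and the bracketed parts are the question's own placeholders, left unfilled. The script only prints the command (a dry run), since it cannot run as-is until those placeholders are replaced.

```shell
# Values copied from the question; the bracketed parts are placeholders to fill in.
# The variable names are illustrative only.
MASTER_URL='spark://ec2-[ip].compute-1.amazonaws.com:7077'
MAIN_CLASS='[classname]'
APP_JAR='target/scala-2.10/[jarname]_2.10-1.0.jar'
CSV_PACKAGE='com.databricks:spark-csv_2.10:1.2.0'

# Dry run: print the command. Drop the leading "echo" to submit for real.
# All spark-submit options come before the application jar.
echo ./bin/spark-submit \
  --master "$MASTER_URL" \
  --class "$MAIN_CLASS" \
  --packages "$CSV_PACKAGE" \
  "$APP_JAR"
```

The package is needed at submit time because sbt package builds a jar containing only your own classes, so the spark-csv classes listed in build.sbt are not inside it.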

claudiaann1 answered Oct 19 '22 23:10
