Spark SQL - PostgreSQL JDBC Classpath Issues

I’m having an issue connecting Spark SQL to a PostgreSQL data source. I’ve downloaded the Postgres JDBC jar and included it in an uber jar using sbt-assembly.

My (failing) source code: https://gist.github.com/geowa4/a9bc238ca7c372b95267.

I’ve also tried using sqlContext.jdbc(), preceded by a reference to classOf[org.postgresql.Driver]. It appears the driver program can load the org.postgresql.Driver class just fine.

Any help would be much appreciated. Thanks.

SimpleApp.scala:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._
    val commits = sqlContext.load("jdbc", Map(
      "url" -> "jdbc:postgresql://192.168.59.103:5432/postgres",
      "dbtable" -> "commits",
      "driver" -> "org.postgresql.Driver"))
    commits.select("message").show(1)
  }
}

simple.sbt:

name := "simple-project"

version := "1.0"

scalaVersion := "2.11.6"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.3.1" % "provided"
libraryDependencies += "org.postgresql" % "postgresql" % "9.4-1201-jdbc41"

output (Edited):

Exception in thread "main" java.lang.ClassNotFoundException: org.postgresql.Driver
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.spark.sql.jdbc.DefaultSource.createRelation(JDBCRelation.scala:102)
        at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:219)
        at org.apache.spark.sql.SQLContext.load(SQLContext.scala:697)
        at SimpleApp$.main(SimpleApp.scala:17)
        at SimpleApp.main(SimpleApp.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

EDIT: I changed the Scala version to 2.10.5, and the output changed to the stack trace shown above. I feel like I'm making progress.

asked May 13 '15 17:05 by geowa4


1 Answer

There is a general problem with JDBC: the primordial classloader must know about the driver jar. In Spark 1.3 this can be addressed by setting the SPARK_CLASSPATH environment variable, as described here: https://spark.apache.org/docs/1.3.0/sql-programming-guide.html#jdbc-to-other-databases

In Spark 1.4, this should be fixed by #5782.
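One way to check which classloader can actually see the driver is a small diagnostic like the sketch below. It is not part of the original question or of Spark; the object name DriverCheck and the helper visibleTo are hypothetical names chosen for illustration. It uses only the standard Class.forName(name, initialize, loader) overload and compares the JVM's system classloader (the AppClassLoader that appears in the stack trace) with the thread's context classloader. Assuming the uber jar is loaded through a child classloader, the first check would be expected to fail while the second succeeds.

```scala
object DriverCheck {
  // Returns true if `className` can be loaded through `loader`.
  // `initialize = false` avoids running the class's static initializers.
  def visibleTo(loader: ClassLoader, className: String): Boolean =
    try {
      Class.forName(className, false, loader)
      true
    } catch {
      case _: ClassNotFoundException => false
    }

  def main(args: Array[String]): Unit = {
    // Default to the PostgreSQL driver class named in the question.
    val name = if (args.nonEmpty) args(0) else "org.postgresql.Driver"

    // The system classloader is the one that fails in the stack trace above.
    val system = ClassLoader.getSystemClassLoader
    // The context classloader may additionally see jars added at submit time.
    val context = Thread.currentThread.getContextClassLoader

    println(s"system classloader sees $name:  ${visibleTo(system, name)}")
    println(s"context classloader sees $name: ${visibleTo(context, name)}")
  }
}
```

If the system classloader reports false, the jar has to be put on the JVM's own classpath (for example through SPARK_CLASSPATH as described above) rather than only bundled into the application jar.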

answered Oct 23 '22 22:10 by Michael Armbrust