
spark - Exception in thread "main" java.sql.SQLException: No suitable driver

SOLVED: the line prop.setProperty("driver", "oracle.jdbc.driver.OracleDriver") must be added to the connection properties.

I'm trying to launch a Spark job locally. I created a jar with dependencies with Maven.

This is my pom.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.agildata</groupId>
    <artifactId>spark-rdd-dataframe-dataset</artifactId>
    <packaging>jar</packaging>
    <version>1.0</version>

    <properties>    
        <exec-maven-plugin.version>1.4.0</exec-maven-plugin.version>
        <spark.version>1.6.0</spark.version>
    </properties>

    <dependencies>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_2.11</artifactId>
            <version>${spark.version}</version>
        </dependency>

        <dependency>
            <groupId>com.oracle</groupId>
            <artifactId>ojdbc7</artifactId>
            <version>12.1.0.2</version>
        </dependency>

    </dependencies>

    <build>
        <plugins>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.2</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>

            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <executions>
                    <execution>
                        <id>scala-compile-first</id>
                        <phase>process-resources</phase>
                        <goals>
                            <goal>add-source</goal>
                            <goal>compile</goal>
                        </goals>
                    </execution>
                    <execution>
                        <id>scala-test-compile</id>
                        <phase>process-test-resources</phase>
                        <goals>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>


            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-assembly-plugin</artifactId>
                <version>2.4.1</version>
                <configuration>
                    <!-- get all project dependencies -->
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                    <!-- MainClass in manifest makes an executable jar -->
                    <archive>
                        <manifest>
                            <mainClass>example.dataframe.ScalaDataFrameExample</mainClass>
                        </manifest>
                    </archive>

                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <!-- bind to the packaging phase -->
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>


        </plugins>
    </build>

</project>

I run the mvn package command and the build is successful. Then I try to run the job this way:

GMAC:bin gabor_dev$ sh spark-submit --class example.dataframe.ScalaDataFrameExample --master spark://QGMAC.local:7077 /Users/gabor_dev/IdeaProjects/dataframe/target/spark-rdd-dataframe-dataset-1.0-jar-with-dependencies.jar

but it throws this: Exception in thread "main" java.sql.SQLException: No suitable driver

Full error message:

16/07/08 13:09:22 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:50)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:50)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:49)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:120)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:91)
    at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:222)
    at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:146)
    at example.dataframe.ScalaDataFrameExample$.main(ScalaDataFrameExample.scala:30)
    at example.dataframe.ScalaDataFrameExample.main(ScalaDataFrameExample.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/07/08 13:09:22 INFO SparkContext: Invoking stop() from shutdown hook

The interesting thing is that if I build and run it this way inside the IntelliJ IDEA nested console, it runs and there is no error:

mvn package exec:java -Dexec.mainClass=example.dataframe.ScalaDataFrameExample

This is the relevant Scala code:

val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

val url = "jdbc:oracle:thin:@xxx.xxx.xx:1526:SIDNAME"

val prop = new java.util.Properties
prop.setProperty("user", "usertst")
prop.setProperty("password", "usertst")

val people = sqlContext.read.jdbc(url, "table_name", prop)

people.show()

I checked my jar file, and it contains all the dependencies. Can anybody help me solve this? Thank you!

asked Dec 07 '22 by solarenqu


2 Answers

Here is how you can connect to PostgreSQL using Spark (this assumes the PostgreSQL JDBC driver, org.postgresql:postgresql, is on the classpath):

import java.util.Properties;

import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession sparkSession = SparkSession.builder()
        .appName("dky")
        .master("local[*]")
        .getOrCreate();

Logger.getLogger("org.apache").setLevel(Level.WARN);

Properties properties = new Properties();
properties.put("user", "your user name");
properties.put("password", "your password");

// Naming the driver class explicitly is what avoids "No suitable driver":
Dataset<Row> jdbcDF = sparkSession.read()
        .option("driver", "org.postgresql.Driver")
        .jdbc("jdbc:postgresql://localhost:5432/postgres",
                "your table name along with schema name", properties);
jdbcDF.show();
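
Since the question itself is in Scala, here is the same approach as a Scala sketch; it assumes Spark 2.x (for SparkSession) and keeps the same placeholder values as the Java version:

import java.util.Properties

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("dky")
  .master("local[*]")
  .getOrCreate()

val properties = new Properties()
properties.put("user", "your user name")
properties.put("password", "your password")

// The explicit driver option is what avoids "No suitable driver":
val jdbcDF = spark.read
  .option("driver", "org.postgresql.Driver")
  .jdbc("jdbc:postgresql://localhost:5432/postgres",
    "your table name along with schema name", properties)
jdbcDF.show()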
answered Jan 13 '23 by Dipesh Yadav


So, the missing driver is the JDBC one, and you have to add it to the Spark SQL configuration. Having the driver classes inside the fat jar is not always enough, because Spark resolves the driver through java.sql.DriverManager, which may never see a driver class that was not explicitly registered. You can either fix this when submitting the application (for example with spark-submit's --jars or --driver-class-path options), or through your Properties object, as you did, with this line:

prop.setProperty("driver", "oracle.jdbc.driver.OracleDriver") 
answered Jan 13 '23 by Vale