
spark-sql Table or view not found error

I'm trying to run a basic Java program using Spark SQL and JDBC, and I'm running into the error below. I'm not sure what's wrong. Most of the material I have read does not explain how to fix this problem.

It would also be great if someone could point me to some good reading material on Spark SQL (Spark 2.1.1). I'm planning to use Spark to implement ETL jobs that connect to MySQL and other data sources.

Exception in thread "main" org.apache.spark.sql.AnalysisException: Table or view not found: myschema.mytable; line 1 pos 21;

    String MYSQL_CONNECTION_URL = "jdbc:mysql://localhost:3306/myschema";
    String MYSQL_USERNAME = "root";
    String MYSQL_PWD = "root";

    Properties connectionProperties = new Properties();
    connectionProperties.put("user", MYSQL_USERNAME);
    connectionProperties.put("password", MYSQL_PWD);

    Dataset<Row> jdbcDF2 = spark.read()
              .jdbc(MYSQL_CONNECTION_URL, "myschema.mytable", connectionProperties);
    spark.sql("SELECT COUNT(*) FROM myschema.mytable").show();
user3616977 asked Jun 09 '17 14:06


1 Answer

This happens because Spark does not automatically register the tables reachable through a JDBC connection in the Spark SQL catalog. You must register the DataFrame yourself:

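    // Register the DataFrame under a name that Spark SQL can resolve
    jdbcDF2.createOrReplaceTempView("mytable");
    // Query the temp view (no schema prefix) and print the result
    spark.sql("select count(*) from mytable").show();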

Your jdbcDF2 is backed by myschema.mytable in MySQL; Spark loads the data from that table lazily, when an action runs.

Remember that a MySQL table is not the same as a Spark table or view. You are telling Spark to read data from MySQL, but you must still register the resulting DataFrame or Dataset as a table or view in the current SparkSession (or SQLContext) before you can refer to it by name in spark.sql.
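Putting the pieces together, here is a minimal sketch of the whole flow. It reuses the URL, credentials, and table name from the question; the class name JdbcTempViewExample, the helper countViaTempView, and the local[*] master are my own assumptions for illustration, and it presumes the Spark SQL and MySQL JDBC driver jars are on the classpath.

```java
import java.util.Properties;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JdbcTempViewExample {

    // Hypothetical helper: registers df as a temp view in this session,
    // then counts its rows through Spark SQL using the view name.
    static long countViaTempView(SparkSession spark, Dataset<Row> df, String viewName) {
        df.createOrReplaceTempView(viewName);
        return spark.sql("SELECT COUNT(*) AS cnt FROM " + viewName)
                .first()
                .getLong(0); // COUNT(*) comes back as a long
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("JdbcTempViewExample")
                .master("local[*]") // assumption: running locally
                .getOrCreate();

        // Connection settings copied from the question
        Properties connectionProperties = new Properties();
        connectionProperties.put("user", "root");
        connectionProperties.put("password", "root");

        // This only defines where the data comes from; nothing is read yet
        Dataset<Row> jdbcDF2 = spark.read()
                .jdbc("jdbc:mysql://localhost:3306/myschema",
                      "myschema.mytable",
                      connectionProperties);

        // The view name "mytable" is what spark.sql can see,
        // not the MySQL name myschema.mytable
        long rows = countViaTempView(spark, jdbcDF2, "mytable");
        System.out.println("Row count: " + rows);

        spark.stop();
    }
}
```

The temp view lives only in the SparkSession that created it, so the name you pass to createOrReplaceTempView is the one to use in later spark.sql calls within that session.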

T. Gawęda answered Nov 09 '22 08:11