Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sparklyr can't see databases created in Hive and vice versa

I installed Apache Hive in local and I was trying to read tables via Rstudio/sparklyr.

I created a database using Hive:

hive> CREATE DATABASE test;

and I was trying to read that database using the following R script:

library(sparklyr)
library(dplyr)
library(DBI)

spark_disconnect_all()

Sys.setenv(SPARK_HOME = "/home/alessandro/spark-2.1.0-bin-hadoop2.7")

config <- spark_config()
config$spark.executor.instances <- 4
config$spark.executor.cores <- 4
config$spark.executor.memory <- "4G"
config$spark.sql.hive.metastore <- "/home/alessandro/spark-warehouse"
config$hive.metastore.warehouse.dir <- "/home/alessandro/spark-warehouse"
sc <- spark_connect(master="local", config=config, version="2.1.0")

dbGetQuery(sc, "show databases")

Still, dbGetQuery(sc, "show databases") does not show the database created, and that's awkward since the database folder test.db is correctly placed in the specified hive.metastore.warehouse.dir.

Similarly, if I create a database using dbGetQuery(sc, "CREATE DATABASE test2"), a database folder is created in hive.metastore.warehouse.dir, but I can't see it via Hive using:

hive> show databases;

Basically, even if all the database folders are placed in the correct path, from Hive I can see only databases created via Hive, whereas from R I can see only databases created via R.

like image 836
stochazesthai Avatar asked Nov 02 '25 01:11

stochazesthai


1 Answers

I solved the issue adding in the file hive-site.xml the configuration to connect with hive:

   <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
like image 117
CVec Avatar answered Nov 03 '25 16:11

CVec



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!