
Check if table exists in hive metastore using Pyspark

I am trying to check whether a table exists in the Hive metastore and, if it does not, create it. If the table exists, I want to append data to it.

I have a snippet of the code below:

spark.catalog.setCurrentDatabase("db_name")
# listTables takes a database name and returns a list of Table objects
db_catalog = spark.catalog.listTables(dbName='db_name')
if any(table_name == tbl.name for tbl in db_catalog):
    # table exists: append data
    pass
else:
    # table does not exist: create the table
    pass

However, I am getting this error:

ValueError: Some of types cannot be determined after inferring

I am unable to resolve the ValueError, and I get the same error for tables created in other databases in the Hive metastore. Is there another way to check whether a table exists in the Hive metastore?

Asked Aug 25 '19 by Cryssie

People also ask

How do I know if a table exists in Hive?

Issue the SHOW TABLES command to see the tables and views that exist in the current database, or SHOW TABLES IN <db_name> to list the tables in a specific database.
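
The same SHOW TABLES check can also be run from PySpark through spark.sql. A minimal sketch, assuming the placeholder names db_name and table_name:

# Run SHOW TABLES against the database and look for the table by name.
# The result is a DataFrame that includes a tableName column.
tables = spark.sql("SHOW TABLES IN db_name")
table_exists = tables.filter("tableName = 'table_name'").count() > 0
print(table_exists)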


1 Answer

You can use the underlying JVM catalog object for this:

# tableExists(dbName, tableName) on the JVM catalog returns a boolean
if spark._jsparkSession.catalog().tableExists('db_name', 'tableName'):
    print("exists")
else:
    print("does not exist")

PySpark exposes Python functionality over a Py4J socket, so a call like this is delegated to the JVM SparkSession and its catalog rather than handled on the Python side.

In Spark Scala you can call it directly:

spark.catalog.tableExists("dbName.tableName")
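
Putting the check together with the append/create logic from the question, here is a minimal sketch, assuming a DataFrame df holding the rows to write and the placeholder names db_name and table_name:

# db_name / table_name are placeholders; df is the DataFrame to be written.
if spark._jsparkSession.catalog().tableExists("db_name", "table_name"):
    # Table already exists: append the new rows
    df.write.mode("append").saveAsTable("db_name.table_name")
else:
    # Table does not exist yet: create it from the DataFrame
    df.write.saveAsTable("db_name.table_name")

Note that newer PySpark releases also expose spark.catalog.tableExists directly from Python, which avoids reaching into the JVM object.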
Answered Sep 21 '22 by Yukeshkumar