I have a Hive instance with several databases, each containing some tables. I'd like to show the tables for one specific database (say, 3_db).
+------------------+--+
| database_name |
+------------------+--+
| 1_db |
| 2_db |
| 3_db |
+------------------+--+
If I enter beeline from bash, nothing complex is needed; I just do the following:
show databases;
show tables from 3_db;
When I use pyspark via an IPython notebook, though, the same trick doesn't work; the second line (show tables from 3_db) raises an error:
sqlContext.sql('show databases').show()
sqlContext.sql('show tables from 3_db').show()
What's wrong, and why does the same code work in one place but not the other?
Use in instead of from in the Spark SQL statement:
sqlContext.sql("show tables in 3_db").show()
Another possibility is to use the Catalog methods:
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
spark.catalog.listTables("3_db")
Just be aware that in PySpark this method returns a list, while in Scala it returns a DataFrame.