Is there any way to run Spark SQL queries against AWS Glue with a local master?
I launch this code on my local PC:
SparkSession.builder()
.master("local")
.enableHiveSupport()
.config("hive.metastore.client.factory.class", "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory")
.getOrCreate()
.sql("show databases"); // this query isn't running against AWS Glue
EDIT
Based on some examples, it appears that the hive.metastore.uris configuration key should allow specifying a particular metastore URL; however, it's not clear how to get the relevant value for Glue:
SparkSession.builder()
.master("local")
.enableHiveSupport()
.config("hive.metastore.client.factory.class", "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory")
.config("hive.metastore.uris", "thrift://???:9083")
.getOrCreate()
.sql("show databases"); // this query isn't running against AWS Glue
Amazon provides this client, which should solve the problem (I haven't tried it yet):
https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore
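One thing that may be worth checking first is whether the factory class named in the question is visible to the local JVM at all. Below is a minimal sketch of such a check using only the JDK. The class name GlueCatalogCheck and the helper isOnClasspath are hypothetical names made up for this illustration; only the factory class name comes from the question. Finding the class is necessary but probably not sufficient, since the client may need further build or setup steps that this check does not cover.

```java
public class GlueCatalogCheck {
    // Factory class name taken from the question's configuration.
    static final String GLUE_FACTORY =
        "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory";

    // Hypothetical helper: returns true if the named class can be located
    // by this class's class loader, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, GlueCatalogCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (isOnClasspath(GLUE_FACTORY)) {
            System.out.println("Glue client factory found on the classpath.");
        } else {
            System.out.println("Glue client factory NOT found; the setting "
                + "hive.metastore.client.factory.class cannot take effect.");
        }
    }
}
```

Running this with the same classpath as the Spark application shows whether the client jar is being picked up before any Spark session is created.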