I created a Development Endpoint in the AWS Glue console and now I have access to SparkContext and SQLContext in gluepyspark console.
How can I access the catalog and list all databases and tables? The usual sqlContext.sql("show tables").show()
does not work.
What might help is the CatalogConnection Class but I have no idea in which package it is. I tried importing from awsglue.context and no success.
Glue returns back one page per response. If you have more than 100 tables, make sure you use NextToken
to retrieve all tables.
def get_glue_tables(database=None):
next_token = ""
while True:
response = glue_client.get_tables(
DatabaseName=database,
NextToken=next_token
)
for table in response.get('TableList'):
print(table.get('Name'))
next_token = response.get('NextToken')
if next_token is None:
break
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With