
How to set the name for a crawled table?

An AWS Glue crawler has a prefix property that is applied to the tables it adds. If I leave the prefix empty and point the crawler at s3://my-bucket/some-table-backup, it creates a table named some-table-backup. Is there a way to rename it to my-awesome-table and have the crawler keep updating the renamed table? Or can the crawler be set up to create the new table with a name I provide?

Cherry asked Jan 18 '18 13:01


2 Answers

It's not possible to set up the crawler to do this, but it is very fast to create a new table that is the same as the table created by the crawler in every way, except the name. In Python:

import boto3

database_name = "database"
table_name = "prefix-dir_name"
new_table_name = "more_awesome_name"
    
client = boto3.client("glue")
response = client.get_table(DatabaseName=database_name, Name=table_name)
table_input = response["Table"]
table_input["Name"] = new_table_name

# Drop the read-only keys that get_table returns but create_table rejects.
# pop(key, None) avoids a KeyError when a key is absent from the response.
for key in ("CreatedBy", "CreateTime", "UpdateTime", "DatabaseName", "IsRegisteredWithLakeFormation"):
    table_input.pop(key, None)
catalog_id = table_input.pop("CatalogId")

# Create the copy under the new name in the same database and catalog
client.create_table(
    DatabaseName=database_name,
    TableInput=table_input,
    CatalogId=catalog_id,
)
Dan Hook answered Oct 10 '22 00:10


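I ran into the same issue, and had to drop more attributes than in Dan Hook's answer before the table could be queried in Redshift. The jq filter below keeps only StorageDescriptor, TableType and Parameters and sets the new name; the old table is deleted once the copy exists.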

table_input="$(aws glue --region us-west-2 get-table --database-name database --name old_table --query 'Table' | jq '{Name: "new_table", StorageDescriptor, TableType, Parameters}')"

aws glue create-table --region us-west-2 --database-name database --table-input "$table_input"
aws glue delete-table --region us-west-2 --database-name database --name "old_table"
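
For anyone doing this from Python rather than the shell, the same allow-list idea can be sketched roughly as below. This is only a sketch, not a tested replacement for the commands above: build_table_input and KEPT_KEYS are hypothetical names, and the sample dict is a made-up, trimmed-down stand-in for response["Table"] from get_table. One difference from the jq filter is that a key missing from the source table is skipped here, whereas jq would emit it as null.

```python
# Keys kept by the jq filter above; everything else returned by get-table is dropped.
# (KEPT_KEYS and build_table_input are hypothetical names used for illustration.)
KEPT_KEYS = ("StorageDescriptor", "TableType", "Parameters")


def build_table_input(table, new_name, kept_keys=KEPT_KEYS):
    """Return a TableInput-style dict holding only the allow-listed keys of `table`."""
    # Copy only the keys that are both allow-listed and present in the source table.
    table_input = {key: table[key] for key in kept_keys if key in table}
    table_input["Name"] = new_name
    return table_input


if __name__ == "__main__":
    # Made-up, trimmed-down example of what get_table might return under "Table".
    sample = {
        "Name": "old_table",
        "DatabaseName": "database",
        "CreateTime": "2022-10-10T00:00:00",
        "TableType": "EXTERNAL_TABLE",
        "Parameters": {"classification": "csv"},
        "StorageDescriptor": {"Location": "s3://my-bucket/some-table-backup/"},
    }
    print(build_table_input(sample, "new_table"))
```

The result would then be passed as TableInput to client.create_table, as in the first answer. If the crawled table is partitioned, "PartitionKeys" probably needs to be added to the kept keys as well; that is an assumption, not something checked against Redshift here.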
dbaumann answered Oct 10 '22 02:10