
Overwrite MySQL tables with AWS Glue

I have a Lambda process which occasionally polls an API for recent data. This data has unique keys, and I'd like to use Glue to update the table in MySQL. Is there an option to overwrite data using this key (similar to Spark's mode=overwrite)? If not, could I truncate the table in Glue before inserting all the new data?

Thanks

asked Nov 29 '17 15:11 by JoeC


1 Answer

I found a simpler way of doing this with JDBC connections in Glue. When you're writing data to a Redshift cluster, the Glue team recommends truncating the table with a "preactions" statement, as in the following sample code:

datasink5 = glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=resolvechoice4,
    catalog_connection="<connection-name>",
    connection_options={
        "dbtable": "<target-table>",
        "database": "testdb",
        "preactions": "TRUNCATE TABLE <table-name>",
    },
    redshift_tmp_dir=args["TempDir"],
    transformation_ctx="datasink5",
)

where

connection-name the name of your Glue connection to the Redshift cluster
target-table    the table you're loading the data into
testdb          the name of the database
table-name      the name of the table to truncate (ideally the table you're loading into)
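
Since table-name should normally match target-table, it can help to build the options from a single name so the two cannot drift apart. The following is only a minimal sketch: build_truncate_options is a hypothetical helper (not part of the Glue API), "recent_data" is a made-up table name, and whether "preactions" is honoured for a MySQL target is an assumption that the sample above does not confirm, since it was written for Redshift.

```python
def build_truncate_options(database: str, table: str) -> dict:
    """Build connection_options that truncate `table` before loading into it.

    Hypothetical helper for illustration; it only assembles the dictionary
    used in the sample above and does not call Glue itself.
    """
    # The table name is spliced into a SQL statement, so accept only
    # letters, digits, underscores and dots.
    if not table.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return {
        "dbtable": table,
        "database": database,
        # Same name as "dbtable", so the truncated table is always the target.
        "preactions": f"TRUNCATE TABLE {table}",
    }


options = build_truncate_options("testdb", "recent_data")
print(options["preactions"])  # TRUNCATE TABLE recent_data
```

The resulting dictionary would then be passed as connection_options in the from_jdbc_conf call shown above.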
answered Sep 21 '22 08:09 by Rohan Kumar