I have two DynamoDB tables: Table_1 and Table_2. I am trying to deprecate Table_1 and copy its data into Table_2, which has different GSIs and different LSIs.
Table_1 attributes are: Id, state, isReused, empty, normal
Table_2 attributes are: UserId, Status, isOld, normal
Id maps to UserId, state maps to Status, normal maps to normal, empty is dropped from Table_2, and if state is "OLD" then isOld is set to true.
What is the best way to export this data from Table_1, do the transform on the attributes/data, and then load the information back into Table_2?
Currently, I am able to use AWS Data Pipeline to import/export data from Table_1 to Table_2 with the given templates, but this does not do the transforms. I'm guessing that I need to use EMR to do the transforms.
I also use DynamoDB Streams to keep the tables in sync, but from my understanding, DynamoDB Streams only captures item-level changes as they happen, not the data that already exists in a table.
Assuming that you need this data movement only once, I can think of two options:
Instead of using Data Pipeline and writing EMR jobs, you can write a script that scans all the items in Table_1, performs the transform in Java, and then does a conditional put [1] that only writes an item to Table_2 if it does not already exist there. This ensures that any changes made in Table_1 during the backfill (and propagated to Table_2 by your stream) are not overwritten with stale data, so Table_2 always shows the latest information.
[1] http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_PutItem.html
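A minimal sketch of the transform step described above, in plain Java. This is not a full migration script: attribute values are modeled as plain Strings rather than the SDK's AttributeValue type, and the class and method names are mine. Note that isReused is also dropped, since it does not appear in the Table_2 schema.

```java
import java.util.HashMap;
import java.util.Map;

public class ItemTransform {

    // Maps one Table_1 item to the Table_2 schema:
    // Id -> UserId, state -> Status, normal -> normal,
    // isOld = true iff state == "OLD"; empty and isReused are dropped.
    static Map<String, String> transform(Map<String, String> oldItem) {
        Map<String, String> newItem = new HashMap<>();
        newItem.put("UserId", oldItem.get("Id"));
        newItem.put("Status", oldItem.get("state"));
        newItem.put("normal", oldItem.get("normal"));
        newItem.put("isOld", "OLD".equals(oldItem.get("state")) ? "true" : "false");
        return newItem;
    }

    public static void main(String[] args) {
        Map<String, String> old = new HashMap<>();
        old.put("Id", "user-123");
        old.put("state", "OLD");
        old.put("isReused", "false");
        old.put("empty", "");
        old.put("normal", "value");
        System.out.println(transform(old));
    }
}
```

For the conditional-put part, the real PutItem call would add a condition expression such as `attribute_not_exists(UserId)`, so the write fails (rather than overwriting) whenever the stream has already placed a fresher copy of the item in Table_2.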