Change the schema of a DynamoDB table: what is the best/recommended way?

What is the Amazon-recommended way of changing the schema of a large table in a production DynamoDB deployment?

Imagine a hypothetical case where we have a table Person, with primary hash key SSN. This table may contain 10 million items.

Now the news comes that due to the critical volume of identity thefts, the government of this hypothetical country has introduced another personal identification: Unique Personal Identifier, or UPI.

We have to add a UPI attribute and change the schema of the Person table so that the primary hash key is now UPI. We want to support both the current system, which uses SSN, and the new system, which uses UPI, for some time, so the two attributes need to coexist in the Person table.

What is the Amazon-recommended way to do this schema change?

asked Jul 08 '15 by Dimitre Novatchev



1 Answer

There are a few approaches, but first you must understand that you cannot change the key schema of an existing table. To get a different key schema, you have to create a new table. You may be able to reuse your existing table, but the result would be the same as if you had created a different table.

  1. Lazy migration to the same table, without Streams. Every time you modify an item in the Person table, write a new item that uses UPI rather than SSN as the hash key value, and delete the old item keyed at SSN (see the first sketch after this list). This assumes that UPI draws from a different range of values than SSN: if SSN looks like XXX-XX-XXXX, then as long as UPI has a different number of digits, the two key ranges can never overlap.
  2. Lazy migration to the same table, using Streams. When DynamoDB Streams becomes generally available, turn on a stream for your Person table with the NEW_AND_OLD_IMAGES stream view type, and subscribe a Lambda function that, whenever it detects a change that adds a UPI to an existing person, deletes the person keyed at SSN and writes a person with the same attributes keyed at UPI (see the second sketch after this list). This approach has race conditions that can be mitigated by adding a version attribute (an atomic counter) to the item and conditioning the DeleteItem call on that version.
  3. Preemptive (scripted) migration to a different table, using Streams. Run a script that scans your table and adds a unique UPI to each Person item (see the third sketch after this list). Create a stream on the Person table with the NEW_AND_OLD_IMAGES stream view type and subscribe a Lambda function to it that writes each Person to a new Person_UPI table whenever it detects that a Person with a UPI was changed or that a Person had a UPI added. Mutations on the base table usually take hundreds of milliseconds to appear in the stream as stream records, so you can do a hot failover to the new Person_UPI table in your application: reject requests for a few seconds, point your application at the Person_UPI table during that time, and re-enable requests.
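
A minimal sketch of the write path for approach 1, assuming a boto3 Table resource for Person whose hash key attribute is named id (a hypothetical name) and whose items carry an upi attribute; none of these attribute names come from the answer itself:

```python
import boto3

dynamodb = boto3.resource("dynamodb")
person = dynamodb.Table("Person")

def save_person(item: dict) -> None:
    """On every modification, re-key the item at UPI and drop the SSN copy."""
    old_key = item["id"]                # currently holds the SSN value
    item = dict(item, id=item["upi"])   # the hash key becomes the UPI value
    person.put_item(Item=item)
    # DeleteItem is idempotent, so replaying this migration for the
    # same person is harmless.
    person.delete_item(Key={"id": old_key})
```

Because the put and the delete target the same table, this only works under the non-overlap assumption above: a UPI value must never collide with an SSN value.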
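For approach 2, the stream-triggered Lambda might look like the following sketch. It assumes the stream uses NEW_AND_OLD_IMAGES, and that items carry hypothetical id, upi, and version attributes; the conditional delete implements the race mitigation described above:

```python
import boto3
from botocore.exceptions import ClientError

client = boto3.client("dynamodb")

def handler(event, context):
    for record in event["Records"]:
        if record["eventName"] != "MODIFY":
            continue
        old = record["dynamodb"]["OldImage"]
        new = record["dynamodb"]["NewImage"]
        # Act only when this modification added a UPI to an SSN-keyed item.
        if "upi" in old or "upi" not in new:
            continue
        client.put_item(TableName="Person", Item=dict(new, id=new["upi"]))
        try:
            client.delete_item(
                TableName="Person",
                Key={"id": old["id"]},
                # Skip the delete if the SSN-keyed item changed again after
                # this stream record was cut; a later record will catch up.
                ConditionExpression="#v = :v",
                ExpressionAttributeNames={"#v": "version"},
                ExpressionAttributeValues={":v": old["version"]},
            )
        except ClientError as e:
            if e.response["Error"]["Code"] != "ConditionalCheckFailedException":
                raise
```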
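For approach 3, the backfill script could be sketched as below. How a UPI is actually issued is outside the answer's scope, so issue_upi is a hypothetical stub, and the attribute names are assumptions as before:

```python
import boto3
from boto3.dynamodb.conditions import Attr
from botocore.exceptions import ClientError

dynamodb = boto3.resource("dynamodb")
person = dynamodb.Table("Person")

def issue_upi(item: dict) -> str:
    """Hypothetical stub: obtain the government-issued UPI for this person."""
    raise NotImplementedError

def backfill_upis() -> None:
    scan_kwargs = {"FilterExpression": Attr("upi").not_exists()}
    while True:
        page = person.scan(**scan_kwargs)
        for item in page["Items"]:
            try:
                person.update_item(
                    Key={"id": item["id"]},
                    UpdateExpression="SET upi = :u",
                    ExpressionAttributeValues={":u": issue_upi(item)},
                    # Don't clobber a UPI that live traffic added meanwhile.
                    ConditionExpression=Attr("upi").not_exists(),
                )
            except ClientError as e:
                if e.response["Error"]["Code"] != "ConditionalCheckFailedException":
                    raise
        if "LastEvaluatedKey" not in page:
            break
        scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]
```

The stream subscriber that populates Person_UPI would mirror the approach-2 handler, with the put targeting the Person_UPI table instead of Person.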
answered Sep 29 '22 by Alexander Patrikalakis