I'm facing some questions about my database design. I'm developing an API that lets users do the following:
Calling the API methods triggers AWS Lambda to perform the requested operations in the DynamoDB tables.
My current plan looks like this:
It should be possible to query items by specifying a time frame and a Profile ID. But I think my design completely defeats the purpose of DynamoDB; the AWS documentation says that a well-designed application requires only one table.
If you have multiple services in your application, each of them should have its own DynamoDB table. You should think of a DynamoDB table as similar to an RDBMS instance: everywhere you would have a separate RDBMS instance, you should have a separate DynamoDB table.
As a general rule, you should maintain as few tables as possible in a DynamoDB application. To better understand why keeping few tables (ideally only one) is beneficial, let's briefly review the DynamoDB data model.
The main schema difference you will see between single-table and multi-table models is that a single-table design will have generically named attributes that form the table's partition and sort key. This is required because different entity types will likely have differently named primary key fields.
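To make that concrete, here is a rough boto3 sketch of such a generically keyed table definition; the table name "app-data" and the key names "pkey"/"skey" are placeholders chosen for illustration, not anything DynamoDB mandates:

```python
import boto3

# Minimal single-table definition sketch: the partition and sort keys get
# generic names so that any entity type (Account, Profile, Item, ...) can be
# stored under them. Table name and billing mode are assumptions.
dynamodb = boto3.client("dynamodb")
dynamodb.create_table(
    TableName="app-data",
    AttributeDefinitions=[
        {"AttributeName": "pkey", "AttributeType": "S"},
        {"AttributeName": "skey", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "pkey", "KeyType": "HASH"},   # partition key
        {"AttributeName": "skey", "KeyType": "RANGE"},  # sort key
    ],
    BillingMode="PAY_PER_REQUEST",
)
```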
DynamoDB increased the default quota for the number of DynamoDB tables you can create and manage per AWS account and AWS Region from 256 to 2,500 tables. DynamoDB also increased the number of table management operations you can perform concurrently from 50 to 500.
I’m going to give this answer assuming that you need to be able to do the following queries.
One of the beauties of DynamoDB (and also a bane, perhaps) is that it is mostly schema-less. You need to have the mandatory Primary Key attributes for every item in the table, but all of the other attributes can be anything you like. In order to have a DynamoDB design with only one table, you usually need to get used to the idea of having mixed types of objects in the same table.
That being said, here’s a possible schema for your use case. My suggestion assumes that you are using something like UUIDs for your identifiers.
The partition key is a field that is simply called pkey (or whatever you want). We'll also call the sort key skey (but again, it doesn't really matter). Now, for an Account, the value of pkey is Account-{{uuid}} and the value of skey would be the same. For a Profile, the pkey value is also Account-{{uuid}}, but the skey value is Profile-{{uuid}}. Finally, for an Item, the pkey is Profile-{{uuid}} and the skey is Item-{{type}}-{{uuid}}. As for the other attributes of an item, don't worry about them; just use whatever attributes you want.
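As a rough illustration of that key scheme (the table name, extra attributes, and example values below are assumptions, not part of the question), writing the three entity types with boto3 could look like this:

```python
import uuid
import boto3

table = boto3.resource("dynamodb").Table("app-data")  # hypothetical table name

account_id = str(uuid.uuid4())
profile_id = str(uuid.uuid4())
item_id = str(uuid.uuid4())

# Account: pkey and skey are both "Account-{uuid}"
table.put_item(Item={
    "pkey": f"Account-{account_id}",
    "skey": f"Account-{account_id}",
    "email": "owner@example.com",            # any other attributes you like
})

# Profile: pkey points at the owning Account, skey identifies the Profile
table.put_item(Item={
    "pkey": f"Account-{account_id}",
    "skey": f"Profile-{profile_id}",
    "displayName": "example profile",
})

# Item: pkey points at the owning Profile, skey encodes the item type and id
table.put_item(Item={
    "pkey": f"Profile-{profile_id}",
    "skey": f"Item-Type2-{item_id}",
    "createdAt": "2021-06-01T12:00:00Z",
})
```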
Since the "parent" object is always the partition key, you can get any of the "child" objects simply by querying for the ID of the parent. For example, your key condition expression to get all the 'ItemType2's for a Profile would be:
pkey = "Profile-{{uuid}}" AND begins_with(skey, "Item-Type2")
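With boto3, that key condition could be expressed roughly like this (the table name and the placeholder Profile ID are assumptions):

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("app-data")   # hypothetical table name
profile_uuid = "11111111-1111-1111-1111-111111111111"  # placeholder Profile id

# All 'ItemType2' children of one Profile
resp = table.query(
    KeyConditionExpression=Key("pkey").eq(f"Profile-{profile_uuid}")
    & Key("skey").begins_with("Item-Type2")
)
item_type2s = resp["Items"]
```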
In this schema, your GSI has the same keys as the table, but reversed. You can query the GSI for Item-{{type}}-{{uuid}} to get the owning Profile, and similarly query for a Profile's key to get the owning Account.
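As a sketch, a reverse lookup on that GSI could look like the following; the index name "skey-pkey-index" is an assumption, so substitute whatever you name yours:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("app-data")   # hypothetical table name
item_uuid = "22222222-2222-2222-2222-222222222222"     # placeholder Item id

# The GSI's partition key is the table's sort key (skey), so querying it by the
# Item's key returns that item stored under the pkey of its owning Profile.
resp = table.query(
    IndexName="skey-pkey-index",                       # assumed GSI name
    KeyConditionExpression=Key("skey").eq(f"Item-Type2-{item_uuid}"),
)
owning_profile = resp["Items"][0]["pkey"]              # e.g. "Profile-{uuid}"
```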
What I have illustrated here is the adjacency list pattern. The DynamoDB documentation also describes how to use composite sort keys for hierarchical data, which would also be suitable for your data and, depending on your expected queries, might be a better fit than the adjacency list.
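For the time-frame requirement from the question, a composite sort key is one way to go. The sketch below assumes the Item sort key is changed to embed an ISO 8601 timestamp (e.g. Item-{{type}}-{{timestamp}}-{{uuid}}), which is an alternative to the format shown above rather than part of it:

```python
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("app-data")   # hypothetical table name
profile_uuid = "11111111-1111-1111-1111-111111111111"  # placeholder Profile id

# If Items were stored with skey = "Item-Type2-<ISO timestamp>-<uuid>", a
# time-frame query for one Profile becomes a single key condition, because
# ISO 8601 timestamps sort lexicographically.
resp = table.query(
    KeyConditionExpression=Key("pkey").eq(f"Profile-{profile_uuid}")
    & Key("skey").between("Item-Type2-2021-06-01", "Item-Type2-2021-06-30T23:59:59Z")
)
june_items = resp["Items"]
```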
You don't have to put everything in a single table. Yes, DynamoDB recommends it, but it is far more important to make sure that your application is correct and maintainable. If having multiple tables means it's easier to write a defect-free application, then use multiple tables.