
How to design a key schema to have only one DynamoDB table per application?

According to the DynamoDB documentation: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-general-nosql-design.html

"You should maintain as few tables as possible in a DynamoDB application. Most well designed applications require only one table."

But in my experience, partition key design always forces you to do the opposite.

Consider the following situation. We have several user roles, for example "admin", "manager", and "worker". The usual workflow of an admin is to CRUD manager data, where the read operation fetches not a single manager but the list of all managers. The same goes for a manager, who CRUDs worker data. In both cases there are only two key usage scenarios:

  • get a list of all items (item key doesn't matter)
  • work with a particular item using its full key.

Naturally, we should have a uniformly distributed partition key (as the doc emphasises), so we can't use the user role for it and should use the user id instead. Since the partition key is already a random id, a sort key is pointless: the partition key alone already identifies exactly one user. At this point we realize that the user id works like a charm for CUD operations, but every R operation has to scan the whole table and then filter the result by user role, which is inefficient. How can this be improved? Very naturally: just have a separate table for each user type! Then the admin API scans the manager table for the manager list, and the manager API scans the worker table for the worker list.
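To make the read cost concrete, here is a minimal in-memory sketch of the two layouts described above. It is a toy model in plain Python, not the DynamoDB API: the `scan` helper, the sample users, and the attribute names `userId` and `userRole` are all assumptions made for illustration. In real DynamoDB, a Scan with a filter likewise consumes read capacity for every item it reads, not only for the items it returns.

```python
def scan(table, predicate=lambda item: True):
    """Toy Scan: read every item, then filter.

    Returns (matching_items, number_of_items_read).
    """
    matches = [item for item in table.values() if predicate(item)]
    return matches, len(table)


# Layout A: one table keyed by userId (hypothetical sample data).
single_table = {
    "u1": {"userId": "u1", "userRole": "admin"},
    "u2": {"userId": "u2", "userRole": "manager"},
    "u3": {"userId": "u3", "userRole": "manager"},
    "u4": {"userId": "u4", "userRole": "worker"},
    "u5": {"userId": "u5", "userRole": "worker"},
    "u6": {"userId": "u6", "userRole": "worker"},
}

# Layout B: the workaround from the paragraph above, one table per role.
tables_by_role = {}
for item in single_table.values():
    tables_by_role.setdefault(item["userRole"], {})[item["userId"]] = item

managers_a, read_a = scan(single_table, lambda i: i["userRole"] == "manager")
managers_b, read_b = scan(tables_by_role["manager"])

print(read_a, read_b)  # 6 2
```

Both layouts return the same two managers; the only difference is how many items had to be read to find them.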

I have used DynamoDB for almost a year and still can't get it. For me, the reality is that in real-life scenarios a sort key is something you can never use. The only real case I had for it was accessing items like "agreements" that belong to two users of different types at the same time, so the primary key was { partition: "managerId", sort: "userId" } and the global secondary index was { partition: "userId", sort: "managerId" }. That way I could efficiently query for all agreements of a particular manager or all agreements of a particular user by providing only the corresponding manager or user id. The approach is discussed in the doc here: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-adjacency-graphs.html.

I feel that I don't understand the concept at all. What would be an effective key schema for the example above that uses only one DynamoDB table for both user types?
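For reference, the agreements layout described above could be written down roughly as follows. This is only a sketch of the request parameters as plain Python dictionaries; the table name "Agreements", the index name "InvertedIndex", string-typed ids, and on-demand billing are assumptions, not details from the actual application.

```python
# Parameters for a CreateTable request. The names "Agreements" and
# "InvertedIndex" are hypothetical; ids are assumed to be strings.
agreements_table = {
    "TableName": "Agreements",
    "AttributeDefinitions": [
        {"AttributeName": "managerId", "AttributeType": "S"},
        {"AttributeName": "userId", "AttributeType": "S"},
    ],
    # Primary key: { partition: "managerId", sort: "userId" }
    "KeySchema": [
        {"AttributeName": "managerId", "KeyType": "HASH"},
        {"AttributeName": "userId", "KeyType": "RANGE"},
    ],
    # Global secondary index with the two keys swapped.
    "GlobalSecondaryIndexes": [
        {
            "IndexName": "InvertedIndex",
            "KeySchema": [
                {"AttributeName": "userId", "KeyType": "HASH"},
                {"AttributeName": "managerId", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumed: on-demand capacity
}


def agreements_for_manager(manager_id):
    """Query parameters: all agreements of one manager (base table)."""
    return {
        "TableName": "Agreements",
        "KeyConditionExpression": "managerId = :m",
        "ExpressionAttributeValues": {":m": {"S": manager_id}},
    }


def agreements_for_user(user_id):
    """Query parameters: all agreements of one user (inverted index)."""
    return {
        "TableName": "Agreements",
        "IndexName": "InvertedIndex",
        "KeyConditionExpression": "userId = :u",
        "ExpressionAttributeValues": {":u": {"S": user_id}},
    }
```

Assuming boto3 is installed and credentials are configured, these dictionaries would be passed as `boto3.client("dynamodb").create_table(**agreements_table)` and `boto3.client("dynamodb").query(**agreements_for_manager("m-1"))`.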

Arsenii Fomin asked Sep 11 '18 14:09

1 Answer

It sounds like what you need in this case is a Global Secondary Index (https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html) where the partition key is the user role. That way, you can query all users with a particular role through that UserRoleIndex and, with the help of a sort key on the user ID, single out one particular user within that role.
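As a rough sketch of that index, the request parameters might look like the following Python dictionaries. The table name "Users", the attribute names "userId" and "userRole", string-typed keys, and on-demand billing are assumptions for illustration; only the index name UserRoleIndex comes from the paragraph above.

```python
# Parameters for an UpdateTable request that adds the index to an existing
# table keyed by userId. "Users", "userId" and "userRole" are assumed names;
# on-demand billing is assumed, so no ProvisionedThroughput is given.
add_role_index = {
    "TableName": "Users",
    "AttributeDefinitions": [
        {"AttributeName": "userRole", "AttributeType": "S"},
        {"AttributeName": "userId", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexUpdates": [
        {
            "Create": {
                "IndexName": "UserRoleIndex",
                "KeySchema": [
                    {"AttributeName": "userRole", "KeyType": "HASH"},
                    {"AttributeName": "userId", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        }
    ],
}


def users_with_role(role):
    """Query parameters: every user that has the given role."""
    return {
        "TableName": "Users",
        "IndexName": "UserRoleIndex",
        "KeyConditionExpression": "userRole = :r",
        "ExpressionAttributeValues": {":r": {"S": role}},
    }


def user_within_role(role, user_id):
    """Query parameters: one particular user within a role."""
    return {
        "TableName": "Users",
        "IndexName": "UserRoleIndex",
        "KeyConditionExpression": "userRole = :r AND userId = :u",
        "ExpressionAttributeValues": {
            ":r": {"S": role},
            ":u": {"S": user_id},
        },
    }
```

With boto3 these would be used as `client.update_table(**add_role_index)` and `client.query(**users_with_role("manager"))`. Note that a global secondary index can only be read with Query or Scan, so even the single-user lookup goes through Query rather than GetItem.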

Alternatively, if you are starting from scratch with a new table, you might not even need an index (unless you don't know the role of a user when you delete them). You can use a "composite primary key" (https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.CoreComponents.html#HowItWorks.CoreComponents.PrimaryKey) where the partition key and the sort key would be the same as in the index I am suggesting above.

Using the same notation that you used in your question, I would recommend { partition: "userRole", sort: "userId" }.
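A minimal sketch of that key schema and the two access patterns from the question, again as plain request-parameter dictionaries. The table name "Users", string-typed keys, and on-demand billing are assumptions; the key attribute names follow the notation above.

```python
# Parameters for a CreateTable request with the composite primary key
# { partition: "userRole", sort: "userId" }. "Users" is a hypothetical name.
users_table = {
    "TableName": "Users",
    "AttributeDefinitions": [
        {"AttributeName": "userRole", "AttributeType": "S"},
        {"AttributeName": "userId", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "userRole", "KeyType": "HASH"},
        {"AttributeName": "userId", "KeyType": "RANGE"},
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumed: on-demand capacity
}


def list_users(role):
    """Query parameters: list every user with the given role, no Scan."""
    return {
        "TableName": "Users",
        "KeyConditionExpression": "userRole = :r",
        "ExpressionAttributeValues": {":r": {"S": role}},
    }


def user_key(role, user_id):
    """Full primary key of one user, as needed by GetItem, UpdateItem
    and DeleteItem. Both parts are required, including the role."""
    return {
        "TableName": "Users",
        "Key": {"userRole": {"S": role}, "userId": {"S": user_id}},
    }
```

With boto3, the manager list would be `client.query(**list_users("manager"))`, and a single worker would be read or removed with `client.get_item(**user_key("worker", "u4"))` or `client.delete_item(**user_key("worker", "u4"))`. This is also why the role has to be known at delete time when no index is used.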

DynamoDB can be hard to understand sometimes, and there definitely are cases where a traditional SQL database makes more sense. This video from AWS re:Invent 2018 is great for understanding the difference between the two: https://www.youtube.com/watch?v=HaEPXoXVf2k&feature=youtu.be.

In your case, though, it looks like you have a very clear access pattern, so DDB would work for you.

Yves Gurcan answered Oct 12 '22 23:10