DynamoDB: Keys and what they mean

Tags:

amazon-dynamodb

I'm confused as to how to use DynamoDB table keys. The documentation mentions HASH keys (which also seem to be referred to as Partition keys) and RANGE (or SORT?) keys. I'm trying to roughly align these with my previous understanding of database indexing theory.

My current, mostly guess-based understanding is that a HASH key is essentially a primary key - it must be unique and is automatically indexed for fast-reading - and a RANGE key is basically something you should apply to any other field you plan on querying on (either in a WHERE-like or sorting context).

This is then somewhat confused by the introduction of Local and Global Secondary Indexes. How do they play into things?

If anyone could nudge me in the right direction, bearing in mind my current, probably flawed understanding has come from the docs, I'd be super grateful.

Thanks!


asked Sep 04 '17 12:09

user1381745

1 Answer

Basically, a DynamoDB table is partitioned by its partition key (also called the hash key).

1) If the table has only a partition key, then the partition key has to be unique. Table performance depends heavily on the partition key. A good partition key has well-scattered values, so that items spread evenly across partitions (avoid using a sequence number as the partition key, as you would for an RDBMS primary key in legacy systems).

2) If the table has both a partition key and a sort key (also called the RANGE key), then the combination of the two needs to be unique. In RDBMS terms, it is a kind of composite key.
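As a rough sketch of the two cases above, the key schema can be written in the shape that boto3's create_table expects. The table and attribute names here (Users, Orders, customer_id, order_date) are made up for illustration, string key types and on-demand billing are assumptions, and nothing in this snippet talks to AWS.

```python
def build_table_params(table_name, partition_key, sort_key=None):
    """Build keyword arguments shaped like those of boto3's create_table.

    All names passed in are hypothetical; both keys are assumed to be strings ("S").
    """
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        # The sort key is declared with KeyType RANGE.
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append({"AttributeName": sort_key, "AttributeType": "S"})
    return {
        "TableName": table_name,
        "AttributeDefinitions": attribute_definitions,
        "KeySchema": key_schema,
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand billing
    }


# Case 1: partition key only, so each user_id must be unique.
users_params = build_table_params("Users", "user_id")

# Case 2: partition key + sort key, so (customer_id, order_date) must be unique.
orders_params = build_table_params("Orders", "customer_id", "order_date")
```

With credentials configured, something like boto3.client("dynamodb").create_table(**orders_params) should accept this dictionary as-is.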

However, the usage differs in DynamoDB. DynamoDB has no sorting functionality (i.e. no ORDER BY clause) across partition keys; results can be ordered only within a single partition key value, and only by the sort key. For example, if you have 10 items with the same partition key value and different sort key values, you can get them back sorted by the sort key attribute. You can't sort on any other attribute, including the partition key.

All items that share a partition key value are kept together, ordered by sort key (i.e. physically co-located), for better performance.
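To make this concrete, here is a toy in-memory model (not DynamoDB itself, and not how it is implemented internally) of items grouped by partition key and read back in sort key order. The class name ToyTable and the sample data are invented for illustration.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self):
        # partition key value -> {sort key value -> attributes}
        self._partitions = defaultdict(dict)

    def put_item(self, partition_key, sort_key, attributes):
        # The (partition key, sort key) pair is unique: writing it again overwrites.
        self._partitions[partition_key][sort_key] = attributes

    def query(self, partition_key, ascending=True):
        # Ordering is only available inside one partition, and only by sort key.
        items = self._partitions.get(partition_key, {})
        return [(sk, items[sk]) for sk in sorted(items, reverse=not ascending)]


table = ToyTable()
table.put_item("alice", "2017-09-03", {"total": 20})
table.put_item("alice", "2017-09-01", {"total": 5})
table.put_item("bob", "2017-09-02", {"total": 7})
table.put_item("alice", "2017-09-01", {"total": 9})  # same key pair: overwrites

alice_orders = table.query("alice")
alice_newest_first = table.query("alice", ascending=False)
```

Note that there is no way in this model to ask for "all orders, sorted by date" without visiting every partition, which mirrors the limitation described above.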

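LSI - A table can have up to five LSIs. They must be defined when you create the table and cannot be added later. An LSI is a kind of alternate sort key for the table: it keeps the table's partition key and uses a different attribute as the sort key.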
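A minimal sketch of what an LSI definition might look like, again in the shape of boto3's create_table arguments. The names (Orders, customer_id, order_date, order_total, by_total) are hypothetical, and the dictionary is only built here, not sent to AWS.

```python
# Hypothetical table: orders per customer, sorted by date in the base table.
TABLE_KEY_SCHEMA = [
    {"AttributeName": "customer_id", "KeyType": "HASH"},
    {"AttributeName": "order_date", "KeyType": "RANGE"},
]

lsi_table_params = {
    "TableName": "Orders",
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_date", "AttributeType": "S"},
        {"AttributeName": "order_total", "AttributeType": "N"},
    ],
    "KeySchema": TABLE_KEY_SCHEMA,
    # The LSI is part of the create-table request itself.
    "LocalSecondaryIndexes": [
        {
            "IndexName": "by_total",
            "KeySchema": [
                # Same partition key as the table...
                {"AttributeName": "customer_id", "KeyType": "HASH"},
                # ...but a different sort key.
                {"AttributeName": "order_total", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand billing
}

lsi = lsi_table_params["LocalSecondaryIndexes"][0]
```

With this index, one customer's orders could be read back ordered by order_total instead of order_date.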

GSI - In order to understand GSI, you need to understand the difference between SCAN and QUERY API in DynamoDB.

SCAN - used when you don't know the partition key (i.e. a full table scan that reads every item to find the ones you want)

QUERY - used when you know the partition key (a condition on the sort key is optional)
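The difference can be illustrated with a small in-memory model that counts how many items each operation has to examine. This is a simplification with made-up data and function names, not the real API.

```python
from collections import defaultdict

ITEMS = [
    {"customer_id": "alice", "order_date": "2017-09-01", "order_status": "shipped"},
    {"customer_id": "alice", "order_date": "2017-09-03", "order_status": "pending"},
    {"customer_id": "bob", "order_date": "2017-09-02", "order_status": "shipped"},
    {"customer_id": "carol", "order_date": "2017-09-04", "order_status": "pending"},
]


def build_partitions(items, key_name):
    """Group items by the value of their partition key attribute."""
    partitions = defaultdict(list)
    for item in items:
        partitions[item[key_name]].append(item)
    return partitions


def scan(partitions, predicate):
    """Examine every item in every partition, then keep the ones that match."""
    examined = 0
    matches = []
    for partition_items in partitions.values():
        for item in partition_items:
            examined += 1
            if predicate(item):
                matches.append(item)
    return matches, examined


def query(partitions, key_value):
    """Go straight to one partition; only its items are examined."""
    found = partitions.get(key_value, [])
    return list(found), len(found)


partitions = build_partitions(ITEMS, "customer_id")
shipped, scan_cost = scan(partitions, lambda i: i["order_status"] == "shipped")
alice, query_cost = query(partitions, "alice")
```

Here the scan examines all four items to return two, while the query examines only the two items in the "alice" partition.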

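Because DynamoDB is billed by read/write capacity units, and a scan consumes capacity for every item it reads, a scan is not the best option for most use cases, either for cost or for performance. So there is an option to create a GSI with an alternate partition key (and, optionally, an alternate sort key), chosen according to the Query Access Pattern (QAP), so that those lookups can use QUERY instead of SCAN.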
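As a hedged sketch, suppose the table is keyed by customer_id but the application also needs "all orders with a given status". A GSI keyed on that attribute lets the lookup be a Query. The index and attribute names below (by_status, order_status, order_date) are hypothetical; the dictionaries follow the shape of boto3's low-level client arguments but are only constructed here, never sent.

```python
# Hypothetical GSI definition: alternate partition key, optional alternate sort key.
gsi_definition = {
    "IndexName": "by_status",
    "KeySchema": [
        {"AttributeName": "order_status", "KeyType": "HASH"},
        {"AttributeName": "order_date", "KeyType": "RANGE"},
    ],
    "Projection": {"ProjectionType": "ALL"},
}


def build_gsi_query(table_name, index_name, key_name, key_value):
    """Build keyword arguments shaped like those of the boto3 client's query call."""
    return {
        "TableName": table_name,
        "IndexName": index_name,  # run the query against the index, not the base table
        "KeyConditionExpression": "#pk = :pk",
        # A placeholder name sidesteps clashes with DynamoDB reserved words.
        "ExpressionAttributeNames": {"#pk": key_name},
        "ExpressionAttributeValues": {":pk": {"S": key_value}},
    }


gsi_index_key = gsi_definition["KeySchema"][0]["AttributeName"]
shipped_query = build_gsi_query("Orders", gsi_definition["IndexName"], gsi_index_key, "shipped")
```

With credentials configured, boto3.client("dynamodb").query(**shipped_query) would be the corresponding call, assuming the index exists on the table.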


answered Sep 22 '22 15:09

notionquest
