Why is using a common hash key with AWS DynamoDB a bad thing?

I need to have a way to have items ordered by timestamp, so I am considering using a common hash key and unix timestamp as the range key.

According to the FAQ:

When storing data, Amazon DynamoDB divides a table into multiple partitions and 
distributes the data based on the hash key element of the primary key. The provisioned 
throughput associated with a table is also divided among the partitions; each 
partition's throughput is managed independently based on the quota allotted to it. 
There is no sharing of provisioned throughput across partitions. 

As I am using a common hash key, there will be no uneven load distribution, since all the load goes into a single partition.

So if I provision 100 writes for this table, all of that capacity goes to the one partition. I suppose that is a good thing, since no capacity is being wasted?
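To make the proposed layout concrete, here is a minimal Python sketch of what the items would look like. The attribute names (event_group, created_at) and the constant key value are made up for illustration; no actual DynamoDB calls are made.

```python
import time

# Assumption: every item shares the constant hash key "EVENTS", and a
# Unix timestamp serves as the range key, so a Query on the hash key
# returns items in time order.
CONSTANT_HASH_KEY = "EVENTS"  # hypothetical key value

def make_item(payload, ts=None):
    """Build a DynamoDB-style item dict (illustrative, not an API call)."""
    return {
        "event_group": CONSTANT_HASH_KEY,                          # hash key
        "created_at": int(ts if ts is not None else time.time()),  # range key
        "payload": payload,
    }

items = [make_item("a", ts=100), make_item("b", ts=50), make_item("c", ts=75)]

# A Query on event_group = "EVENTS" would return items sorted by created_at;
# note that every single item hashes to the same partition.
ordered = sorted(items, key=lambda i: i["created_at"])
print([i["payload"] for i in ordered])  # -> ['b', 'c', 'a']
```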

Ryan asked Mar 20 '13 14:03

People also ask

Does DynamoDB primary key need to be unique?

The primary key uniquely identifies each item in the table, so that no two items can have the same key. DynamoDB supports two different kinds of primary keys: Partition key – A simple primary key, composed of one attribute known as the partition key.

Is DynamoDB hash key unique?

DynamoDB simple key consists only of one value - the partition/hash key. Each record needs to have one of these, and it needs to be unique. With simple key, DynamoDB essentially works just like a Key-Value store.

What is hot key problem in DynamoDB?

One of the most common issues you face when using DynamoDB, or any similar "Big Data" type database, is a workload that does not access data in a uniform pattern. This kind of issue is commonly known as a hotspot or hot key.


1 Answer

You provision writes and reads to a DynamoDB table, not a partition. Your capacity is spread/shared across the partitions, but each partition also has a fixed rate limit because of the underlying hardware.

By using a single hash key, you will have a fixed limit on how many reads and writes you can actually perform on the table, regardless of how many you are provisioning and paying for.

You can't scale above that limit, because DynamoDB can't further partition your table to parallelize the load; splitting data across more partitions is one of the primary ways AWS scales the system as your provisioned numbers increase.
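The answer's point can be put in rough arithmetic. A sketch in Python, using DynamoDB's documented per-partition write cap of 1,000 WCU (the exact figure has varied over time, so treat the numbers as illustrative):

```python
PER_PARTITION_WCU_CAP = 1000  # documented per-partition write cap (illustrative)

def effective_wcu(provisioned_wcu, active_partitions):
    """Writes per second you can actually sustain, not what you pay for.

    Provisioned throughput only parallelizes across partitions that
    actually receive traffic; with a single hash key, one partition
    does all the work.
    """
    return min(provisioned_wcu, active_partitions * PER_PARTITION_WCU_CAP)

print(effective_wcu(5000, 1))  # single hash key -> 1000, despite paying for 5000
print(effective_wcu(5000, 8))  # load spread over 8 partitions -> 5000
```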

It's possible you won't hit that limit at first, but Amazon recommends against this approach because they want you to design for AWS in ways that will scale.
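A common mitigation, not spelled out in the answer, is write sharding: append a deterministic suffix to the hash key so writes spread across several partitions, then query each shard and merge results when global timestamp order is needed. A hedged Python sketch (the shard count, attribute names, and helper functions are all illustrative assumptions):

```python
import hashlib

NUM_SHARDS = 8  # assumption: choose based on your target write rate

def sharded_hash_key(base_key, record_id):
    """Derive a deterministic shard suffix from the record id, so writes
    for the same logical group spread over NUM_SHARDS hash keys
    (and therefore up to NUM_SHARDS partitions) instead of one."""
    shard = int(hashlib.md5(record_id.encode()).hexdigest(), 16) % NUM_SHARDS
    return f"{base_key}#{shard}"

def merge_by_timestamp(per_shard_results):
    """Reading back in time order now requires querying every shard's
    hash key and merging the results by the range-key timestamp."""
    return sorted((item for shard in per_shard_results for item in shard),
                  key=lambda item: item["created_at"])

# Usage: the same record id always maps to the same shard, so updates
# and reads for a known id stay addressable.
key = sharded_hash_key("EVENTS", "order-1234")
print(key.split("#")[0])  # -> EVENTS
```

The trade-off is read-side fan-out: a time-ordered scan costs one Query per shard, so keep NUM_SHARDS only as large as your write rate requires.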

Eric Hammond answered Oct 20 '22 03:10