
How are consumed read capacity units calculated in DynamoDB query

I've seen the page on Amazon and understand that 1 RCU covers a read of an item up to 4KB.

If I have a table with 50 items, I've read that a scan will read all 50 items and use 50 RCU. But suppose I run a query instead, and my table is 10 by 5: will it still use 50 RCU?

zuba asked May 04 '18 15:05



3 Answers

Scanning a table that contains 50 items will consume 50 RCU only if the combined size of the 50 items is 200KB (for a strongly consistent read; 400KB for an eventually consistent read). Most items are not that big: 50 items typically need only about 10KB of storage, so a full scan of a 50-item table would cost about 3 RCU with strong consistency, or about 1.5 RCU with eventual consistency.

The consumed Read Capacity Units (RCU) depend on multiple factors:

  • the operation (i.e. GetItem vs. Query/Scan)
  • the size of the items
  • whether the read is strongly consistent or eventually consistent

If an item is read using a GetItem operation, then the consumed capacity is billed in increments of 4KB, based on the size of the item (e.g. a 200B item and a 3KB item would each consume 1 RCU, while a 5KB item would consume 2 RCU).
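The per-item rounding can be checked with a few lines of Python. This is only a sketch of the arithmetic for a strongly consistent read: the function name `get_item_rcu` is made up for illustration, and it assumes that a 4KB block means 4096 bytes.

```python
import math

# Assumption: a 4KB billing block is 4096 bytes.
RCU_BLOCK_BYTES = 4 * 1024


def get_item_rcu(item_size_bytes):
    """Estimate RCUs for one strongly consistent GetItem.

    The item size is rounded up to the next 4KB boundary.
    """
    return math.ceil(item_size_bytes / RCU_BLOCK_BYTES)


# The three item sizes mentioned above: 200B, 3KB and 5KB.
for size in (200, 3 * 1024, 5 * 1024):
    print(f"{size} bytes -> {get_item_rcu(size)} RCU")
```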

If you read multiple items using a Query or Scan operation, then the capacity consumed depends on the cumulative size of the items being accessed (you get billed even for items that a filter removes from the result). So, if your query or scan accesses 10 items that are approximately 200 bytes each, it will consume only 1 RCU. If you read 10 items but each item is about 5KB in size, then the total consumed capacity will be 13 RCU (50KB / 4KB = 12.5, which rounds up to 13).

What's more, an eventually consistent read costs half as much, because each capacity unit then covers twice the data. Reading the ten 5KB items would therefore cost only about 6.5 RCU (half of 13).
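For Query and Scan the rounding is applied once, to the summed size, rather than per item. A rough sketch of that calculation follows; `query_rcu` is a hypothetical helper, not part of any AWS SDK, and it assumes 4096-byte blocks and a minimum charge of one block.

```python
import math

# Assumption: a 4KB billing block is 4096 bytes.
RCU_BLOCK_BYTES = 4 * 1024


def query_rcu(item_sizes_bytes, consistent_read=True):
    """Estimate RCUs for one Query/Scan page.

    Sizes of all accessed items are summed first, then rounded up to
    4KB blocks. Eventually consistent reads cost half as much.
    """
    total_bytes = sum(item_sizes_bytes)
    # Assumption: at least one block is charged per request.
    blocks = max(1, math.ceil(total_bytes / RCU_BLOCK_BYTES))
    return float(blocks) if consistent_read else blocks / 2


small_items = [200] * 10       # ten items of roughly 200 bytes
large_items = [5 * 1024] * 10  # ten items of roughly 5KB

print(query_rcu(small_items))                         # 1.0
print(query_rcu(large_items))                         # 13.0
print(query_rcu(large_items, consistent_read=False))  # 6.5
```

The eventually consistent figure comes out fractional because the sketch halves the 13 units without rounding again, which is consistent with DynamoDB reporting consumed capacity in half units.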

You can read more about throughput capacity here.

A couple of things to note:

  • a single item may be as large as 400KB, so reading an item could consume as much as 100 RCU.
  • when calculating item size, attribute names count towards the item size as well, not just their values!
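
To see how much attribute names contribute, here is a deliberately simplified estimator. It only handles string attributes, where the size is the UTF-8 length of the name plus the UTF-8 length of the value; other types are sized differently and are left out. The function name and the sample attribute names are invented for illustration.

```python
def estimate_string_item_size(item):
    """Rough size in bytes of an item whose attributes are all strings.

    Each attribute counts its name and its value, both as UTF-8 bytes.
    Assumption: non-string types are out of scope for this sketch.
    """
    return sum(
        len(name.encode("utf-8")) + len(value.encode("utf-8"))
        for name, value in item.items()
    )


# Same value, different attribute name lengths (hypothetical names).
verbose = {"customer_full_name": "Ann"}
compact = {"n": "Ann"}

print(estimate_string_item_size(verbose))  # 21
print(estimate_string_item_size(compact))  # 4
```

With many small items, long attribute names can make up most of the stored size, which then feeds directly into the 4KB rounding.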

Mike Dinescu answered Oct 24 '22 00:10


Query—Reads multiple items that have the same partition key value. All items returned are treated as a single read operation, where DynamoDB computes the total size of all items and then rounds up to the next 4 KB boundary. For example, suppose your query returns 10 items whose combined size is 40.8 KB. DynamoDB rounds the item size for the operation to 44 KB. If a query returns 1500 items of 64 bytes each, the cumulative size is 96 KB.

Ref: https://docs.amazonaws.cn/en_us/amazondynamodb/latest/developerguide/ProvisionedThroughput.html
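
The two figures in that passage can be reproduced with simple arithmetic. The sketch below works in KB and, like the quoted 96 KB figure, treats 1500 x 64 bytes as 96 KB (i.e. 1 KB taken as 1000 bytes, which is an assumption about how the example was computed). The comparison with 1500 separate GetItem calls is added here only to show why summing before rounding matters.

```python
import math


def billed_kb(total_kb):
    """Round a size in KB up to the next 4 KB boundary."""
    return math.ceil(total_kb / 4) * 4


# 10 items with a combined size of 40.8 KB are billed as 44 KB.
print(billed_kb(40.8))  # 44

# 1500 items of 64 bytes each: 96 KB in total (assuming 1 KB = 1000 bytes).
total_kb = 1500 * 64 / 1000
query_units = billed_kb(total_kb) / 4             # one Query, summed first
get_item_units = 1500 * billed_kb(64 / 1000) / 4  # 1500 separate reads

print(query_units)     # 24.0
print(get_item_units)  # 1500.0
```

Both results are for strongly consistent reads; halve them for eventual consistency.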

rajd answered Oct 24 '22 02:10


I smoke tested this with the following entries, using a composite primary key, provisioned capacity, and eventually consistent reads:

  • entry#1 (size ~ 200B): hash key = foo, range key = foobar

  • entry#2 (size ~ 5KB): hash key = foo, range key = foojar

Queries to the table & reported consumption of RCUs:

  1. hash key EQUALS "foo" AND range key BEGINS_WITH "foo" --> both entries returned, 1 RCU consumed
  2. hash key EQUALS "foo" AND range key BEGINS_WITH "foobar" --> entry with size ~ 200B returned, 0.5 RCU consumed
  3. hash key EQUALS "foo" AND range key BEGINS_WITH "foojar" --> entry with size ~ 5KB returned, 1 RCU consumed

As already speculated, this indicates that the accessed items are those matching the whole composite key condition, not just the hash key.
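
The reported numbers line up with the sum-then-round rule if only the items matching the full key condition are counted. A quick check, assuming the two entries are exactly 200 bytes and 5KB (the real sizes were only approximate) and that a 4KB block is 4096 bytes; the helper name is made up:

```python
import math

# Assumption: a 4KB billing block is 4096 bytes.
RCU_BLOCK_BYTES = 4 * 1024


def eventually_consistent_query_rcu(item_sizes_bytes):
    """Sum the accessed item sizes, round up to 4KB blocks, then halve."""
    blocks = max(1, math.ceil(sum(item_sizes_bytes) / RCU_BLOCK_BYTES))
    return blocks / 2


small = 200        # entry#1, range key "foobar"
large = 5 * 1024   # entry#2, range key "foojar"

expected = {
    'begins_with "foo"': eventually_consistent_query_rcu([small, large]),
    'begins_with "foobar"': eventually_consistent_query_rcu([small]),
    'begins_with "foojar"': eventually_consistent_query_rcu([large]),
}

for condition, rcu in expected.items():
    print(f"{condition}: {rcu} RCU")
```

This prints 1.0, 0.5 and 1.0 RCU, matching the three measurements.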

By comparison, if you queried by hash key only and then filtered the result down to a single item, the query would access all items in the partition and still consume 1 RCU.
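
To repeat this kind of measurement, the two query shapes could be built roughly as follows and passed to the low-level DynamoDB client. This is a sketch under assumptions: the table name, the key attribute names `pk` and `sk`, and the non-key attribute `label` are all hypothetical. Note that a filter expression on a Query cannot reference key attributes, so the filtered variant filters on `label` instead.

```python
def key_condition_query(table_name, hash_value, range_prefix):
    """Query narrowed by the full composite key condition."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk AND begins_with(#sk, :prefix)",
        "ExpressionAttributeNames": {"#pk": "pk", "#sk": "sk"},
        "ExpressionAttributeValues": {
            ":pk": {"S": hash_value},
            ":prefix": {"S": range_prefix},
        },
        "ConsistentRead": False,            # eventually consistent read
        "ReturnConsumedCapacity": "TOTAL",  # ask for ConsumedCapacity
    }


def filtered_query(table_name, hash_value, label_value):
    """Query by hash key only, then filter on a non-key attribute.

    The filter runs after the read, so every item in the partition
    still counts towards consumed capacity.
    """
    return {
        "TableName": table_name,
        "KeyConditionExpression": "#pk = :pk",
        "FilterExpression": "#label = :label",
        "ExpressionAttributeNames": {"#pk": "pk", "#label": "label"},
        "ExpressionAttributeValues": {
            ":pk": {"S": hash_value},
            ":label": {"S": label_value},
        },
        "ConsistentRead": False,
        "ReturnConsumedCapacity": "TOTAL",
    }


narrow = key_condition_query("my-table", "foo", "foobar")
broad = filtered_query("my-table", "foo", "small-entry")

# With boto3 installed and credentials configured, something like:
#   import boto3
#   response = boto3.client("dynamodb").query(**narrow)
#   print(response["ConsumedCapacity"]["CapacityUnits"])
```

Comparing `CapacityUnits` for the two requests against the same partition should show the difference between narrowing by key condition and narrowing by filter.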

L3p1 answered Oct 24 '22 02:10