
DynamoDB read and write units

I've been reading various articles on Amazon DynamoDB, but I'm still a little confused about how the read/write units are used. For example, using the free version, I have 5 write units and 10 read units available per second, each unit representing 1 KB of data. But what does this really mean?

Does this mean a maximum of 10 read requests can be performed per second, or that a maximum of 10 KB of data can be requested per second (regardless of whether there are 10 or 100 requests)? This aspect isn't clear to me. So if I have 20 users who concurrently access a page on my website (which results in 20 queries being performed to retrieve data), what happens? Will 10 of them see the data immediately while the other 10 see it after 1 second? Or will they all see the data immediately if the data requested (multiplied by 20) is less than 10 KB?

Also, if the read units are not enough and 100 users concurrently request 1 KB of data each, does this mean all the requests will take 10 seconds to complete?

Also, the pricing is a little confusing, as I don't understand whether the prices are paid for units reserved or units consumed. For example, they say the price is "Write Throughput: $0.00735 per hour for every 10 units of Write Capacity". Does this mean one will pay $0.00735 × 24 = $0.176 even if no write requests are made during a day?

Biggie Mac asked Jan 08 '14



3 Answers

You are correct in that the capacity is tightly bound to the size of the objects being read/written.

Feb 2016 Updates

AWS has updated how they calculate throughput: the object size used in the calculations has increased from 1 KB to 4 KB. The discussion below is still valid, but certain calculations are different now.

Always consult the latest DynamoDB documentation for the latest information and examples on how to calculate throughput.

Older Documentation

From the AWS DynamoDB documentation (as of 1/8/14):

Units of Capacity required for writes = Number of item writes per second x item size (rounded up to the nearest KB)

Units of Capacity required for reads* = Number of item reads per second x item size (rounded up to the nearest KB)

  • If you use eventually consistent reads you’ll get twice the throughput in terms of reads per second.
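The two formulas above (plus the eventually-consistent discount) can be sketched as a small helper; the function name and the 1 KB unit size are just illustrative, matching the pre-2016 documentation quoted here:

```python
import math

def required_units(items_per_second, item_size_kb, unit_kb=1.0,
                   eventually_consistent=False):
    """Capacity units needed: item reads/writes per second times the
    item size rounded up to the nearest capacity-unit size (1 KB in
    the documentation quoted above)."""
    units = items_per_second * math.ceil(item_size_kb / unit_kb)
    if eventually_consistent:
        units /= 2  # eventually consistent reads give twice the throughput
    return math.ceil(units)

# 10 reads/sec of 1 KB items -> 10 read units
print(required_units(10, 1))                              # 10
# 500 reads/sec of 0.5 KB items (rounded up to 1 KB) -> 500 units
print(required_units(500, 0.5))                           # 500
# same workload, eventually consistent -> half the units
print(required_units(10, 1, eventually_consistent=True))  # 5
```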

Per your example question, if you want to read 10 KB of data per second you'll need 10 Read Units provisioned. It doesn't matter if you make 10 requests for 1 KB of data or if you make a single request for 10 KB of data. You're limited to 10 KB/second.

Note that the required number of units of Read Capacity is determined by the number of items being read per second, not the number of API calls. For example, if you need to read 500 items per second from your table, and if your items are 1KB or less, then you need 500 units of Read Capacity. It doesn’t matter if you do 500 individual GetItem calls or 50 BatchGetItem calls that each return 10 items.

For your 20 user example, keep in mind that data is rounded up to the nearest KB. So even if your 20 users request 0.5 KB of data, you'll need 20 Read Units to service all of them at once. If you only have 10 read units, then the other 10 requests will be throttled. If you use the Amazon DynamoDB libraries, they have auto-retry logic baked in to try the request again so they should eventually get serviced.

For your question about 100 users, some of those requests may simply be throttled and the retry logic may eventually fail (the code will only retry the request so many times before it stops trying) - so you need to be ready to handle those 400 response codes from DynamoDB and react accordingly. It's very important to monitor your application when you use DynamoDB and ensure you aren't going to be throttled on app critical transactions.
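The retry behavior described above is typically exponential backoff with jitter. Here's a minimal sketch of the idea; `ThrottledError` is a hypothetical stand-in for DynamoDB's throughput-exceeded error, and the timings are illustrative:

```python
import random
import time

class ThrottledError(Exception):
    """Hypothetical stand-in for DynamoDB's
    ProvisionedThroughputExceededException."""

def call_with_backoff(request, max_attempts=5):
    """Retry a throttled call with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return request()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # back off 50 ms, 100 ms, 200 ms, ... plus random jitter
            time.sleep(0.05 * (2 ** attempt) + random.uniform(0, 0.05))
```

The AWS SDKs implement this kind of logic for you, but as the answer notes, the retries eventually give up, so your code still has to handle the final failure.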

Your last question about pricing - you pay hourly for what you reserve. If you reserve 1000 Read Units and your site has absolutely no traffic, then too bad, you'll still pay hourly for those 1000 Read Units.
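Using the historical price quoted in the question ($0.00735 per hour per 10 write units), the reserved-capacity arithmetic looks like this; the function is just an illustration, not an official pricing formula:

```python
def write_cost(provisioned_units, hours,
               price_per_10_units_hour=0.00735):
    """Cost of reserved write capacity: you pay for what you
    provision, whether or not any writes actually happen."""
    return provisioned_units / 10 * price_per_10_units_hour * hours

# 10 write units for one day, zero traffic: still ~$0.18
print(round(write_cost(10, hours=24), 4))        # 0.1764
# 1000 write units for a 30-day month
print(round(write_cost(1000, hours=24 * 30), 2)) # 529.2
```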

For completeness - keep in mind that throughput is provisioned PER TABLE. So if you have 3 DynamoDB tables: Users, Photos, Friends, then you have to provision capacity for each table, and you need to determine what is appropriate for each table. In this trivial example, perhaps Photos is accessed less frequently in your app, so you can provision lower throughput compared to your Users table.

Eventually consistent reads are great for cost saving but your app has to be designed to handle it. An eventually consistent read means that if you update data and immediately try to read the new value, you may not get the new value back, it may still return the previous value. Eventually, with enough time, you'll get the new value. You pay less since you aren't guaranteed to read the latest data - but that can be OK if you design appropriately.

Mike Pugh answered Oct 25 '22

Think of it as a pipe diameter: you pay for a possible data throughput per second. The number of requests isn't relevant.

Besides, if you ask for 10 read units, then you will indeed pay for 10 units, regardless of your actual traffic.

If your traffic were to rise above the limit, you would first get a warning (say, at 80% of your provisioned throughput). Then requests begin to take more time. If you stay above the limit for a significant amount of time, new connections can be refused for a few minutes.

Hope that helps

aherve answered Oct 25 '22

Adding and updating items consumes your write throughput, and getting/querying items consumes your read throughput in DynamoDB. The maximum size for a single item in a DynamoDB table is 400 KB; the bigger your items are, the more throughput you consume and the higher your cost will be.

If you look items up by key, no table scan happens and you only need throughput equivalent to your item size. For example, if your item size is 4 KB, you need 1 read capacity unit (1 unit covers 4 KB per second); if you want to read 40 KB of data per second, you'll need 10 Read Units provisioned. It doesn't matter whether you make 10 requests for 4 KB of data or a single request for 40 KB: you're limited to 40 KB/second.

But if you search on anything other than the key, DynamoDB scans the complete table. When there is a lot of data in the table, that scan will exceed your provisioned throughput. You could raise the table's throughput to the maximum the scan needs, but that increases the cost and leaves the database sitting almost completely idle most of the time.
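The key-lookup arithmetic in this answer (4 KB read units, post-2016 sizing) can be sketched as follows; the function is illustrative, assuming strongly consistent reads:

```python
import math

READ_UNIT_KB = 4  # one read capacity unit covers 4 KB per second

def read_units_needed(total_kb_per_second):
    """Read units for a given throughput, rounded up to whole units."""
    return math.ceil(total_kb_per_second / READ_UNIT_KB)

print(read_units_needed(40))  # 10 units: 10 x 4 KB or 1 x 40 KB, same cost
print(read_units_needed(4))   # 1 unit
print(read_units_needed(1))   # still 1 unit: reads round up to 4 KB
```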

ABHAY JOHRI answered Oct 25 '22