
How to write more than 25 items/rows into Table for DynamoDB?

I am quite new to Amazon DynamoDB. I currently have 20000 rows that I need to add to a table. However, based on what I've read, it seems that I can only write up to 25 rows at a time using the BatchWriteItem class with 25 WriteRequests. Is it possible to increase this? How can I write more than 25 rows at a time? It is currently taking about 15 minutes to write all 20000 rows. Thank you.

asked Jun 26 '15 by code


People also ask

How do you increase the capacity of the table in DynamoDB?

With DynamoDB auto scaling, a table or a global secondary index can increase its provisioned read and write capacity to handle sudden increases in traffic, without request throttling. When the workload decreases, DynamoDB auto scaling can decrease the throughput so that you don't pay for unused provisioned capacity.
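As a rough illustration (not from the original page), auto scaling for a table's write capacity can be configured through the Application Auto Scaling API. The sketch below uses boto3; the table name, capacity bounds, and target utilization are assumptions you would replace with your own values.

    import boto3

    autoscaling = boto3.client("application-autoscaling")
    resource_id = "table/my-table"  # assumption: your table name

    # Register the table's write capacity as a scalable target.
    autoscaling.register_scalable_target(
        ServiceNamespace="dynamodb",
        ResourceId=resource_id,
        ScalableDimension="dynamodb:table:WriteCapacityUnits",
        MinCapacity=5,
        MaxCapacity=200,
    )

    # Track a target of ~70% consumed write capacity; DynamoDB scales the
    # provisioned throughput up or down to stay near that utilization.
    autoscaling.put_scaling_policy(
        ServiceNamespace="dynamodb",
        ResourceId=resource_id,
        ScalableDimension="dynamodb:table:WriteCapacityUnits",
        PolicyName="write-capacity-target-tracking",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": 70.0,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
            },
        },
    )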

How can you increase your DynamoDB table limit in a region?

You can use the Service Quotas console , the AWS API and the AWS CLI to check the global secondary indexes per table default and current quotas that apply for your account, and to request quota increases, when needed. You can also request quota increases by cutting a ticket to https://aws.amazon.com/support .
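For reference, a minimal sketch of checking those quotas programmatically with boto3's Service Quotas client is shown below. It only lists the quotas that apply to DynamoDB in your account; requesting an increase would additionally require the specific QuotaCode (not reproduced here) passed to request_service_quota_increase.

    import boto3

    quotas = boto3.client("service-quotas")

    # List the quotas that apply to DynamoDB and print their current values.
    paginator = quotas.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode="dynamodb"):
        for quota in page["Quotas"]:
            print(quota["QuotaName"], quota["Value"])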

What is maximum limit for the size of an item collection in DynamoDB?

The maximum size of a DynamoDB item is 400KB. From the Limits in DynamoDB documentation: The maximum item size in DynamoDB is 400 KB, which includes both attribute name binary length (UTF-8 length) and attribute value lengths (again binary length). The attribute name counts towards the size limit.


1 Answer

You can only send up to 25 items in a single BatchWriteItem request, but you can send as many BatchWriteItem requests as you want at one time. Assuming you've provisioned enough write throughput, you should be able to speed things up significantly by splitting those 20k rows between multiple threads/processes/hosts and pushing them to the database in parallel.
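For example, here is a minimal sketch (not the answerer's code) of that parallel approach in Python with boto3: the rows are split into 25-item chunks, each chunk is sent as one BatchWriteItem call from a thread pool, and any UnprocessedItems are resent. The table name, item format, and worker count are assumptions.

    import boto3
    from concurrent.futures import ThreadPoolExecutor

    TABLE_NAME = "my-table"  # assumption: replace with your table name
    dynamodb = boto3.client("dynamodb")

    def chunks(items, size=25):
        """Yield successive 25-item slices (the BatchWriteItem maximum)."""
        for i in range(0, len(items), size):
            yield items[i:i + size]

    def write_batch(batch):
        """Send one BatchWriteItem call, resending any UnprocessedItems."""
        request = {TABLE_NAME: [{"PutRequest": {"Item": item}} for item in batch]}
        while request:
            response = dynamodb.batch_write_item(RequestItems=request)
            request = response.get("UnprocessedItems") or None

    def load_all(items, workers=8):
        """Push all items to DynamoDB using a pool of parallel writers.

        `items` is assumed to already be in DynamoDB's attribute-value
        format, e.g. {"id": {"S": "123"}, "name": {"S": "example"}}.
        """
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # list() drains the map so any exception is raised here.
            list(pool.map(write_batch, chunks(items)))

In practice you would also back off between UnprocessedItems retries, since getting items back usually means you are exceeding the table's provisioned write throughput.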

It's maybe a bit heavyweight for a dataset this small, but you can also use AWS Data Pipeline to ingest data from S3. It basically automates the process of creating a Hadoop cluster that pulls your data down from S3 and sends it to DynamoDB in a bunch of parallel BatchWriteItem requests.

answered Nov 15 '22 by David Murray