
EMR DynamoDB export failed because table capacity is set to on-demand

After we changed the DynamoDB table's capacity mode to on-demand, the Data Pipeline job that exports the table failed with this error:

Exception in thread "main" java.lang.RuntimeException: Read throughput should not be less than 1. Read throughput percent: 0.0
at org.apache.hadoop.dynamodb.read.AbstractDynamoDBInputFormat.getSplits(AbstractDynamoDBInputFormat.java:51)
at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:520)
at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:512)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)

Is there a workaround for this issue?

Thanks

--gsu

G.Su asked Jan 07 '19 18:01




1 Answer

I'd contact AWS support to confirm, but I was told that the EMR DynamoDB connector does not formally support tables in on-demand capacity mode yet. So, more than likely, you will need to switch the table back to provisioned capacity as a workaround.
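
For anyone who needs that workaround, here is a minimal sketch of switching a table back to provisioned capacity from Python. It is not taken from the original thread. It assumes a boto3-style DynamoDB client is passed in, that the table has no global secondary indexes (those need their own throughput settings in the same request), and that the capacity numbers are placeholders you would size for your export. The function names are hypothetical.

```python
from typing import Any, Dict


def provisioned_update_request(table_name: str, read_units: int, write_units: int) -> Dict[str, Any]:
    """Build UpdateTable keyword arguments that switch a table to provisioned mode."""
    if read_units < 1 or write_units < 1:
        raise ValueError("provisioned capacity must be at least 1 unit each")
    return {
        "TableName": table_name,
        "BillingMode": "PROVISIONED",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


def switch_to_provisioned(client: Any, table_name: str, read_units: int, write_units: int) -> bool:
    """Request provisioned mode only if the table is currently on-demand.

    `client` is assumed to behave like boto3.client("dynamodb").
    Returns True if an update was requested, False otherwise.
    """
    table = client.describe_table(TableName=table_name)["Table"]
    # BillingModeSummary may be missing; this sketch treats that case
    # as a table that is already in provisioned mode.
    mode = table.get("BillingModeSummary", {}).get("BillingMode", "PROVISIONED")
    if mode != "PAY_PER_REQUEST":
        return False
    client.update_table(**provisioned_update_request(table_name, read_units, write_units))
    return True
```

With boto3 installed, usage would look like `switch_to_provisioned(boto3.client("dynamodb"), "my-table", 100, 100)`, where the table name and capacity values are made up for illustration. AWS limits how often a table can change capacity modes, so check the current quota before switching back and forth around each export.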

Edit: As of 23 January 2019, the EMR connector for DynamoDB supports tables set to on-demand billing.

Kirk answered Sep 29 '22 21:09