AWS DynamoDB and MapReduce in Java

I have a huge DynamoDB table that I want to analyze to aggregate data that is stored in its attributes. The aggregated data should then be processed by a Java application. While I understand the really basic concepts behind MapReduce, I've never used it before.

In my case, let's say that every DynamoDB item has a customerId and an orderNumbers attribute, and that there can be more than one item for the same customer. For example:

customerId: 1, orderNumbers: 2
customerId: 1, orderNumbers: 6
customerId: 2, orderNumbers: -1

Basically I want to sum the orderNumbers for each customerId, and then execute some operations in Java with the aggregate.
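To make the goal concrete, here is a minimal sketch of that aggregation in plain Java, ignoring DynamoDB and MapReduce entirely. The `OrderTotals` class, the `Item` type, and `sumByCustomer` are hypothetical names used only for illustration, and the in-memory list stands in for items that would really be read from the table.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class OrderTotals {
    // Hypothetical in-memory stand-in for one DynamoDB item.
    static final class Item {
        final int customerId;
        final int orderNumbers;

        Item(int customerId, int orderNumbers) {
            this.customerId = customerId;
            this.orderNumbers = orderNumbers;
        }
    }

    // Sums orderNumbers per customerId; a TreeMap keeps the keys sorted.
    static Map<Integer, Integer> sumByCustomer(List<Item> items) {
        return items.stream().collect(Collectors.groupingBy(
                item -> item.customerId,
                TreeMap::new,
                Collectors.summingInt(item -> item.orderNumbers)));
    }

    public static void main(String[] args) {
        List<Item> items = List.of(new Item(1, 2), new Item(1, 6), new Item(2, -1));
        System.out.println(sumByCustomer(items)); // prints {1=8, 2=-1}
    }
}
```

This only works while the data fits in memory on one machine, which is exactly the limitation a huge table runs into.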

AWS Elastic MapReduce could probably help me, but I don't understand how to connect a custom JAR with DynamoDB. My custom JAR probably needs to expose both a map and a reduce function; where can I find the right interface to implement?

Also, I'm a bit confused by the docs: it seems like I should first export my data to S3 before running my custom JAR. Is this correct?

Thanks

asked Apr 08 '12 23:04

Mark

1 Answer

Note: I haven't built a working EMR setup myself; this is based only on reading about it.

First of all, read Prerequisites for Integrating Amazon EMR with Amazon DynamoDB.

You can work directly on DynamoDB, without exporting to S3 first: see Hive Command Examples for Exporting, Importing, and Querying Data in Amazon DynamoDB. As you can see there, you can run "SQL-like" queries that way.

If you are new to Hadoop, you should probably read some introductory material first, such as What is Hadoop.

Another good read is the tutorial Using Amazon Elastic MapReduce with DynamoDB.

Regarding your custom JAR application, you need to upload it to S3. Use this guide: How to Create a Job Flow Using a Custom JAR
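As for what the JAR itself contains: as far as I know, a real Hadoop job supplies subclasses of `org.apache.hadoop.mapreduce.Mapper` and `org.apache.hadoop.mapreduce.Reducer`, and the framework does the grouping between them. The sketch below shows only the shape of that logic in plain Java with no Hadoop dependency, so that it runs on its own. The `MapReduceSketch` class, its method names, and the comma-separated `customerId,orderNumbers` record format are assumptions made for illustration, not what EMR actually hands to a job.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapReduceSketch {
    // Map phase: turn one input record ("customerId,orderNumbers", an assumed format)
    // into a (customerId, orderNumbers) pair.
    static Map.Entry<Integer, Integer> map(String record) {
        String[] parts = record.split(",");
        return new SimpleEntry<>(
                Integer.parseInt(parts[0].trim()),
                Integer.parseInt(parts[1].trim()));
    }

    // Reduce phase: collapse all values seen for one key into a single total.
    static int reduce(int customerId, List<Integer> values) {
        int total = 0;
        for (int value : values) {
            total += value;
        }
        return total;
    }

    // Stand-in for the framework: runs map, groups pairs by key (the "shuffle"),
    // then runs reduce once per key.
    static Map<Integer, Integer> run(List<String> records) {
        Map<Integer, List<Integer>> grouped = new TreeMap<>();
        for (String record : records) {
            Map.Entry<Integer, Integer> pair = map(record);
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        Map<Integer, Integer> totals = new TreeMap<>();
        grouped.forEach((customerId, values) -> totals.put(customerId, reduce(customerId, values)));
        return totals;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("1,2", "1,6", "2,-1"))); // prints {1=8, 2=-1}
    }
}
```

In a real job, only the bodies of `map` and `reduce` would be yours; the `run` method here merely imitates what the cluster does across many machines.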

I hope this will help you get started.

answered Sep 29 '22 02:09

Chen Harel

Tags: java, amazon-web-services, amazon-dynamodb, mapreduce, elastic-map-reduce