Hadoop or Hadoop Streaming for MapReduce on AWS

I'm about to start a MapReduce project that will run on AWS, and I have to choose between Java and C++.

I understand that writing the project in Java would make more functionality available to me. However, C++ could do the job too, through Hadoop Streaming.

Mind you, I have little background in either language. A similar project has been done in C++ and the code is available to me.

So my question: is this extra functionality available through AWS, or is it only relevant if you have more control over the cloud? Is there anything else I should bear in mind when making the decision, such as Hadoop plugins that work better with one language than the other?

Thanks in advance

asked Dec 28 '09 21:12

aeolist

1 Answer

You have a few options for running Hadoop on AWS. The simplest is to run your MapReduce jobs via Amazon's Elastic MapReduce (EMR) service: http://aws.amazon.com/elasticmapreduce. You could also run your own Hadoop cluster on EC2, as described at http://archive.cloudera.com/docs/ec2.html.

If you suspect you'll need to write your own input/output formats, partitioners, and combiners, I'd recommend using Java on your own EC2 cluster. If your job is relatively simple and you don't plan to use your Hadoop cluster for any other purpose, I'd recommend choosing the language with which you are most comfortable and using EMR.
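To give a sense of what the Streaming route involves for a simple job: a Streaming mapper or reducer is just an executable that reads lines on stdin and writes tab-separated key/value lines on stdout, and the reducer receives its input sorted by key. Below is a minimal word-count sketch under those assumptions. It is written in Python for brevity; a C++ program would follow the same stdin/stdout contract. The function names and the `map`/`reduce` command-line argument are illustrative choices, not part of Hadoop.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record for every word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Sum the counts for each key.

    Assumes the input is already sorted by key, which is what a
    Streaming reducer receives after the shuffle phase.
    """
    # Skip lines without a tab so malformed records do not break unpacking.
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{key}\t{total}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical convention: the first argument selects the step.
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Saved under a hypothetical name such as `wc.py`, this can be tried locally with `cat input.txt | python wc.py map | sort | python wc.py reduce`, where `sort` stands in for the shuffle. On a cluster, the same two commands would be passed to the Streaming jar through its `-mapper` and `-reducer` options.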

Either way, good luck!

Disclosure: I am a founder of Cloudera.

Regards, Jeff

answered Oct 07 '22 01:10

Jeff Hammerbacher

Tags: amazon-web-services, hadoop, streaming, mapreduce
