Is it possible to write map/reduce jobs for Amazon Elastic MapReduce using .NET?

Is it possible to write map/reduce jobs for Amazon Elastic MapReduce (http://aws.amazon.com/elasticmapreduce/) using .NET languages? In particular I would like to use C#.

Preliminary research suggests not. The above URL's marketing text suggests you have a "choice of Java, Ruby, Perl, Python, PHP, R, or C++", without mentioning .NET languages. This Amazon thread (http://developer.amazonwebservices.com/connect/thread.jspa?messageID=136051 -- "Support for C# / F# map/reducers") explicitly says that "currently Amazon Elastic MapReduce does not support Mono platform or languages such as C# or F#."

The above suggests that it can't be done. I'm wondering if there are any workarounds, though. For example, can I modify the Elastic MapReduce machine image for my account, and install Mono on there?

An alternative, suggested by Amazon FAQs "Using Other Software Required by Your Jar" (http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/index.html?CHAP_AdvancedTopics.html) and "How to Use Additional Files and Libraries With the Mapper or Reducer" (http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/index.html?addl_files.html), is to make the first step of the Map/Reduce job be to install Mono on the local instance. That sounds kind of inefficient, but maybe it could work?

Maybe a saner alternative would be to try to forgo the convenience of Elastic MapReduce, and manually set up my own Hadoop cluster on EC2. Then I assume I could install Mono without difficulty.

Chris asked Jul 27 '09 20:07



2 Answers

One likely workaround is to use Hadoop streaming and compile your C# code to native code with Mono's ahead-of-time (AOT) compiler (see http://www.mono-project.com/AOT). I would guess the resulting binary could then be run from S3 the same way a C++ program can.
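For context, Hadoop streaming only requires that the mapper and reducer be executables that read lines from standard input and write tab-separated key/value lines to standard output, with the reducer receiving its input sorted by key. The sketch below illustrates that contract with a word count. It is written in Python purely to show the protocol; a C# program would do the same through Console.In and Console.Out. The function names, the word-count task, and the "map"/"reduce" role argument are all illustrative assumptions, not anything taken from Amazon's documentation.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each key.

    Assumes the input is already sorted by key, which is how
    Hadoop streaming hands records to a reducer.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{key}\t{total}"


def run(role, stdin=sys.stdin, stdout=sys.stdout):
    """Hypothetical entry point: role is 'map' or 'reduce'."""
    step = map_lines if role == "map" else reduce_lines
    for record in step(stdin):
        stdout.write(record + "\n")
```

To use this as a streaming executable, the file would end with a call such as run(sys.argv[1]). Locally, the shuffle phase can be approximated by piping the mapper output through sort before feeding it to the reducer.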

The answer by Reed Copsey is not correct. That VB.NET library is for creating, starting, and stopping jobs, not for writing the code that actually runs inside the Hadoop jobs.

Teun D answered Sep 24 '22 15:09



Yes, it is possible using a bootstrap action, as previous answerers have suggested.

This blog post describes running a C# mapper and an F# reducer with Mono: http://atbrox.com/2011/02/07/an-example-of-using-f-and-c-netmono-with-amazons-elastic-mapreduce-hadoop/

Amund Tveit answered Sep 24 '22 15:09
