Can someone point me to a good web site with good collection of Hadoop algorithms. For example, the most complex thing that I can do with Hadoop right now is Page Rank. Other than that, I can do trivial things like word counting and stuff.
I want to see a web site that show me other usage of hadoop.
Sorting. Sorting is one of the basic MapReduce algorithms to process and analyze data. MapReduce implements sorting algorithm to automatically sort the output key-value pairs from the mapper by their keys. Sorting methods are implemented in the mapper class itself.
MapReduce facilitates concurrent processing by splitting petabytes of data into smaller chunks, and processing them in parallel on Hadoop commodity servers. In the end, it aggregates all the data from multiple servers to return a consolidated output back to the application.
MapReduce is a Hadoop framework used for writing applications that can process vast amounts of data on large clusters. It can also be called a programming model in which we can process large datasets across computer clusters. This application allows data to be stored in a distributed form.
MapReduce program executes in three stages, namely map stage, shuffle stage, and reduce stage.
Here's quite a few machine learning algorithms. Here's academic papers that might be interesting. Finally here's a book on map reduce that looks interesting.
Have a look at this overview: http://atbrox.com/2010/05/08/mapreduce-hadoop-algorithms-in-academic-papers-may-2010-update/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With