
Which Hadoop version to use?

Tags:

hadoop

Both Hadoop in Action and Hadoop: The Definitive Guide build their examples on the mapred classes, and most of those classes were deprecated in 0.20.2. The signatures of the new classes are different. Can anyone tell me what changed? For example, the Partitioner class has been deprecated; how is the new Reducer going to provide that feature? I am mainly interested in the conceptual changes that happened in 0.20.2.
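As a rough illustration of the signature difference mentioned above: as far as I can tell, the old org.apache.hadoop.mapred.Partitioner is an interface, while the new org.apache.hadoop.mapreduce.Partitioner is an abstract class with the same getPartition(key, value, numPartitions) method. The sketch below uses hypothetical stand-in types (OldStylePartitioner, NewStylePartitioner, HashLikePartitioner) rather than the real Hadoop classes, so it runs without Hadoop on the classpath. The hash-and-modulo rule is the usual default hash partitioning scheme; treat everything else as an assumption.

```java
public class PartitionerSketch {

    // Hypothetical stand-in for the old-style contract (an interface).
    interface OldStylePartitioner<K, V> {
        int getPartition(K key, V value, int numPartitions);
    }

    // Hypothetical stand-in for the new-style contract (an abstract class).
    abstract static class NewStylePartitioner<K, V> {
        public abstract int getPartition(K key, V value, int numPartitions);
    }

    // Hash partitioning: clear the sign bit so the result is never
    // negative, then take the remainder by the number of partitions.
    static class HashLikePartitioner<K, V> extends NewStylePartitioner<K, V> {
        @Override
        public int getPartition(K key, V value, int numPartitions) {
            return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    // Convenience helper for the demo: the value is ignored by this scheme.
    static int partitionFor(Object key, int numPartitions) {
        return new HashLikePartitioner<Object, Object>()
                .getPartition(key, null, numPartitions);
    }

    public static void main(String[] args) {
        // The same logic fits the old-style interface as a lambda.
        OldStylePartitioner<Object, Object> old =
                (k, v, n) -> (k.hashCode() & Integer.MAX_VALUE) % n;

        Object[] keys = {"a", 10, -1};
        for (Object key : keys) {
            System.out.println(key + " -> old=" + old.getPartition(key, null, 4)
                    + " new=" + partitionFor(key, 4));
        }
    }
}
```

In this sketch the partitioning logic is identical in both styles; only the way the class is declared (implementing an interface versus extending an abstract class) differs.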

What should I use? On the Hadoop wiki, I see:

Download
1.0.X - current stable version, 1.0 release
1.1.X - current beta version, 1.1 release
2.X.X - current alpha version
0.23.X - similar to 2.X.X but missing NN HA
0.22.X - does not include security
0.20.203.X - legacy stable version
0.20.X - legacy version

Does that mean the mapred classes were deprecated and have since been reintroduced? Which Hadoop version should I use: 0.20.2 or 1.0.x?

S Kr asked Sep 29 '12 21:09




1 Answer

Please check this out; it explains the version numbering used in Hadoop development: http://www.cloudera.com/blog/2012/04/apache-hadoop-versions-looking-ahead-3/

It should give you an idea of why the versioning is so complex.

P.S. I'm using v1.0.3 on my system :)

That is an April Fools' Day post. :)

But everyone can agree the version numbers are misleading at best.

Admin answered Sep 23 '22 22:09