There are so many Hadoop versions and distributions that I find it confusing. I have a few questions.
With Hadoop 3.x we will be using Timeline Service v.2. This version of the Timeline Service is more scalable and reliable than its predecessor, and it improves usability by introducing flows and aggregation.
Using the HDFS command line is one of the best ways to get the detailed version. You can also run the HDP Select command on the host where you want to check the version, or use the Ambari API to get some idea of the HDFS client version shipped and installed as part of HDP.
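As a rough illustration of the command-line approach, here is a minimal Python sketch. It assumes the `hdfs` binary is on the `PATH` and that `hdfs version` prints a first line of the form `Hadoop <version>`; the function names `parse_hadoop_version` and `installed_hadoop_version` are made up for this example and are not part of any Hadoop tooling.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_hadoop_version(output: str) -> Optional[str]:
    """Return the version from the first 'Hadoop <version>' line, or None."""
    for line in output.splitlines():
        match = re.match(r"^Hadoop\s+(\S+)", line.strip())
        if match:
            return match.group(1)
    return None


def installed_hadoop_version() -> Optional[str]:
    """Run `hdfs version` if the binary is available and parse its output.

    Assumption: `hdfs` is on the PATH of the host being checked.
    """
    if shutil.which("hdfs") is None:
        return None
    result = subprocess.run(
        ["hdfs", "version"], capture_output=True, text=True, check=False
    )
    return parse_hadoop_version(result.stdout)


if __name__ == "__main__":
    version = installed_hadoop_version()
    print(version if version else "hdfs command not found or version not recognised")
```

On an HDP host the reported version string is usually longer than the plain Apache one, because the distribution appends its own build number, so it may be worth comparing only the leading `major.minor.patch` part.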
Or is Hadoop dead altogether? In reality, Apache Hadoop is not dead, and many organizations still use it as a robust data analytics solution. One key indicator is that all major cloud providers actively support Apache Hadoop clusters on their platforms.
According to this blog post from Cloudera:
There is next to no functional difference between 0.20.205 and 1.0. This is just a renumbering.
Hadoop's YARN site states:
MapReduce has undergone a complete overhaul in hadoop-0.23 and we now have, what we call, MapReduce 2.0 (MRv2) or YARN
It's also worth having a look at this diagram. It shows the tree of Hadoop versions as well as the third-party distributions built on top of them.
Updated answer: http://elephantscale.com/hadoop2_handbook/Hadoop_Versions.html
(Disclaimer: I am a co-author of this online book.)