
Differences between Hadoop-common, Hadoop-core and Hadoop-client?

Tags:

maven

hadoop

I am new to Hadoop, and want to know the differences between Hadoop-common, Hadoop-core and Hadoop-client.

By the way, for a given class, how do I know which Maven artifact contains it? For example, which one contains org.apache.hadoop.io.Text?

asked Mar 04 '15 by chenzhongpu

People also ask

What are the 3 builds of Hadoop?

Apache Hadoop (HDFS and YARN), Apache HBase, Apache Spark.

What is Hadoop common utilities?

Hadoop Common, or the common utilities, is the set of Java libraries and scripts needed by all the other components in a Hadoop cluster. These utilities are used by HDFS, YARN, and MapReduce to run the cluster.

What are the different components of Hadoop?

There are three components of Hadoop: Hadoop HDFS - Hadoop Distributed File System (HDFS) is the storage unit. Hadoop MapReduce - Hadoop MapReduce is the processing unit. Hadoop YARN - Yet Another Resource Negotiator (YARN) is a resource management unit.

Which of the following are core components of Hadoop?

HDFS (storage) and YARN (processing) are the two core components of Apache Hadoop.


2 Answers

To add some detail on the differences between Hadoop-common, Hadoop-core and Hadoop-client, from a high-level perspective:

  • Hadoop-common refers to the commonly used utilities and libraries that support the Hadoop modules.
  • Hadoop-core is the same as Hadoop-common; it was renamed to Hadoop-common in July 2009, per https://hadoop.apache.org/.
  • Hadoop-client refers to the client libraries used to communicate with Hadoop's core components (HDFS, MapReduce, YARN); it also pulls in supporting dependencies such as logging and codec libraries.

Generally speaking, developers who build applications that submit jobs to YARN, run a MapReduce job, or access files on HDFS should use the Hadoop-client libraries.
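For such applications, declaring hadoop-client in the POM is usually all that is needed, since it pulls in the HDFS, MapReduce and YARN client dependencies transitively. A minimal sketch (the version shown is only an example; pick the one matching your cluster):

```xml
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>2.7.3</version>
</dependency>
```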

answered Sep 18 '22 by Anthony R.


From techopedia

Hadoop Common refers to the collection of common utilities and libraries that support other Hadoop modules. It is an essential part or module of the Apache Hadoop Framework, along with the Hadoop Distributed File System (HDFS), Hadoop YARN and Hadoop MapReduce.

Like all other modules, Hadoop Common assumes that hardware failures are common and that these should be automatically handled in software by the Hadoop Framework.

Hadoop Common is also known as Hadoop Core.

The Hadoop client libraries help to load data into the cluster, submit MapReduce jobs describing how that data should be processed, and then retrieve or view the results of the job when it is finished. Have a look at this article

This Apache link provides the list of dependencies of Hadoop Client library.
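On the question of which artifact contains a given class (in recent Hadoop releases, org.apache.hadoop.io.Text ships in hadoop-common): since jar files are plain zip archives, one low-tech approach is to scan the jars in your local Maven repository for the class's path. The sketch below fabricates a stand-in jar so it runs anywhere; in practice you would point REPO at ~/.m2/repository, and the file names here are illustrative only.

```shell
# Sketch: locate which jar under a repository directory contains a given class.
# We fabricate a stand-in jar so the example is self-contained; point REPO
# at your local Maven cache (~/.m2/repository) to search real artifacts.
set -e
REPO=$(mktemp -d)
mkdir -p "$REPO/org/apache/hadoop/io"
touch "$REPO/org/apache/hadoop/io/Text.class"        # dummy stand-in class file
(cd "$REPO" && python3 -m zipfile -c demo.jar org)   # jars are just zip archives

CLASS='org/apache/hadoop/io/Text.class'
find "$REPO" -name '*.jar' | while read -r jar; do
  # list the jar's entries and check for the class's path
  if python3 -m zipfile -l "$jar" | grep -q "$CLASS"; then
    echo "$CLASS found in $jar"
  fi
done
```

Searching an artifact index such as search.maven.org by fully qualified class name works too, without downloading anything.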

answered Sep 19 '22 by Ravindra babu