
Hadoop distributions [closed]

Tags:

hadoop

I am new to Hadoop. Could you please tell me which distributions of Hadoop are available?

I see standard Apache Hadoop and Cloudera's Distribution for Hadoop (CDH).

What is the difference between these two? Is CDH free or commercial?

MRK asked Dec 06 '11 10:12




1 Answer

Besides Apache Hadoop, it's more or less a three-horse race among Hadoop distributions: Hortonworks, Cloudera and MapR. Then there are Greenplum HD and IBM InfoSphere BigInsights.

Is CDH free or commercial?

CDH from Cloudera is free to use, but you need to pay for support and for the management tools on top of CDH.

What is the difference between these two?

In Apache, all the projects (Pig, Hive, etc.) are independent. Cloudera makes sure all these frameworks work properly with each other and packages them as CDH. CDH has regular releases, which I haven't seen in Apache. Also, it's difficult to get support for Apache Hadoop, while Cloudera and others provide commercial support for their own versions of Hadoop.

Praveen Sripati answered Oct 19 '22 05:10