
Apache Hadoop vs Google Bigdata

  1. Can anyone explain the key differences between Apache Hadoop and Google Big Data?
  2. Which one is better, Hadoop or Google Big Data?
MST asked May 16 '15 13:05



People also ask

Is Google BigQuery based on Hadoop?

No. BigQuery is built on Google's own Dremel technology, not on Hadoop. Google BigQuery is serverless, while Hadoop is not. If you use Hadoop, scaling the capacity of your systems is up to you. If you use BigQuery, you don't have to worry about it, because Google is responsible for scalability. This usually makes BigQuery easier for an in-house team to manage.

Is BigQuery faster than Hive?

One of the most important aspects of working with data warehousing and analytics solutions is the ability to handle large datasets. Google BigQuery is designed for exactly that and is very fast on large datasets.

Is BigQuery the same as Hive?

BigQuery removes most of the operational and administrative effort, saving time and cost for teams that use it instead of Hive, the traditional data warehousing tool in the Hadoop ecosystem.

Are big data and BigQuery the same?

BigQuery is a robust business intelligence platform that works as a “Big Data as a Service” solution. BigQuery is a fully managed, serverless SQL data warehouse that allows for speedy SQL queries and interactive analysis of large datasets (on the order of terabytes or petabytes).


1 Answer

The simple answer: it depends on what you want to do with your data.

Hadoop is used for massive data storage and batch processing of that data. It is mature and popular, and many libraries support it. But if you want real-time analysis or ad hoc queries on your data, Hadoop is not well suited to it.
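To make the batch model concrete, here is a minimal sketch of the map, shuffle, and reduce steps in plain Python. It only mimics the shape of a Hadoop MapReduce word count on one machine; it does not use Hadoop itself, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the whole batch over every input line, then return the totals."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data big queries", "batch data"]
    print(word_count(sample))
```

The point to notice is that nothing is returned until the whole input has been processed, which is why this model fits large scheduled jobs better than interactive questions.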

Google's BigQuery was developed specifically to address this. It lets you run interactive SQL queries over large datasets and get results in near real time.

You can use BigQuery in place of Hadoop, or use BigQuery together with Hadoop to query the datasets produced by MapReduce jobs.
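As a rough sketch of that second pattern, the snippet below loads some job output into a SQL table and then queries it ad hoc. Python's built-in `sqlite3` is used purely as a local stand-in for BigQuery so the example runs anywhere; the sample data, the `word_counts` table, and the helper names are all hypothetical, and a real setup would load the files into BigQuery instead.

```python
import sqlite3

# Hypothetical job output in tab-separated "key<TAB>value" lines,
# the layout Hadoop's default text output uses.
JOB_OUTPUT = "big\t2\ndata\t2\nqueries\t1\nbatch\t1\n"


def load_counts(conn, text):
    """Parse the tab-separated lines and load them into a SQL table."""
    rows = [line.split("\t") for line in text.splitlines() if line]
    conn.execute("CREATE TABLE word_counts (word TEXT, n INTEGER)")
    conn.executemany(
        "INSERT INTO word_counts VALUES (?, ?)",
        [(word, int(n)) for word, n in rows],
    )


def top_words(conn, min_count):
    """Ad hoc query: words seen at least min_count times, most frequent first."""
    cursor = conn.execute(
        "SELECT word, n FROM word_counts WHERE n >= ? ORDER BY n DESC, word ASC",
        (min_count,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a hosted warehouse
    load_counts(conn, JOB_OUTPUT)
    print(top_words(conn, 2))
```

The batch job runs once, and after that you can ask as many different SQL questions of its output as you like without rerunning it.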

So it depends entirely on how you want to process your data. If a batch processing model is required and sufficient, you can use Hadoop; if you need interactive, near-real-time queries, choose Google's BigQuery.

Edit: You can also explore other technologies that work with Hadoop, such as Spark, Storm, and Hive, and choose depending on your use case.

Some useful links for more exploration:

1: gavinbadcock's blog

2: cloudacademy's blog

karthik manchala answered Oct 19 '22 02:10
