
Is Hive faster than Spark?

After reading "What is hive, Is it a database?", I heard from a colleague yesterday that he was able to filter a 15B-record table and join it with another table after doing a "group by", which resulted in 6B records, in only 10 minutes! I wonder whether this would be slower in Spark. With DataFrames the two may now be comparable, but I am not sure, hence the question.

Is Hive faster than Spark? Or does this question not make sense? Sorry for my ignorance.

He uses the latest Hive, which seems to be using Tez.

asked Sep 09 '16 16:09 by gsamaras





1 Answer

Hive is just a framework that adds SQL functionality to MapReduce-style workloads.

These workloads can run on different execution engines, such as MapReduce, Tez, or Spark, usually on top of YARN.

So the meaningful comparison is Hive on Tez vs. Hive on Spark. The article "When to go with ETL on Hive using Tez VS When to go with Spark ETL?" discusses this nicely (gist: use Hive on Spark if you are not sure).
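Hive picks its execution engine through the `hive.execution.engine` property, so one way to compare the two is to run the same query once per engine and time each run. The following is only a minimal sketch of that idea: the helper `script_for_engine`, the tables `events` and `users`, and their columns are hypothetical names made up for illustration, and the query merely mimics the filter, "group by", and join described in the question.

```python
# Sketch: build the same HiveQL script for different execution engines
# so the runs can be timed side by side. All table and column names
# below are hypothetical.

# Values accepted by the hive.execution.engine property.
VALID_ENGINES = ("mr", "tez", "spark")

# Hypothetical query shaped like the one in the question:
# filter, group by, then join with another table.
QUERY = """
SELECT a.user_id, a.cnt, u.country
FROM (
    SELECT user_id, COUNT(*) AS cnt
    FROM events
    WHERE event_date >= '2016-01-01'
    GROUP BY user_id
) a
JOIN users u ON a.user_id = u.user_id
"""


def script_for_engine(engine, query):
    """Return a HiveQL script that selects `engine` and then runs `query`."""
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown Hive execution engine: {engine!r}")
    # Normalize the query so it ends with exactly one semicolon.
    body = query.strip().rstrip(";")
    return f"SET hive.execution.engine={engine};\n{body};\n"


if __name__ == "__main__":
    for engine in ("tez", "spark"):
        print(f"-- script for {engine}")
        print(script_for_engine(engine, QUERY))
```

Each generated script could then be saved to a file and run with a Hive client (for example `beeline -f`), measuring the wall-clock time of each run. Results will depend heavily on cluster size, data layout, and configuration, so a single timing should not be read as a general verdict.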

Benchmark information (chart not shown; lower is better)

answered Oct 28 '22 16:10 by Krishna Kalyan
