How to configure Hive to use Spark?

I have a problem using Hive on Spark. I installed a single-node HDP 2.1 (Hadoop 2.4) via Ambari on CentOS 6.5. I'm trying to run Hive on Spark, so I followed these instructions:

https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started

I already downloaded the "Prebuilt for Hadoop 2.4" version of Spark from the official Apache Spark website. Then I started the master with:

./spark-class org.apache.spark.deploy.master.Master

Then the worker with:

./spark-class org.apache.spark.deploy.worker.Worker spark://hadoop.hortonworks:7077
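At this point it is worth checking that the worker actually registered with the master. A minimal check, assuming the standalone master's web UI is running on its default port 8080 on the same host (the "Workers" table on that page should list the worker started above):

# the master's status page lists all registered workers
curl http://hadoop.hortonworks:8080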

And then I started Hive with this command:

hive --auxpath /SharedFiles/spark-1.0.1-bin-hadoop2.4/lib/spark-assembly-1.1.0-hadoop2.4.0.jar

Then, according to the instructions, I had to switch Hive's execution engine to Spark with this command:

set hive.execution.engine=spark;

And the result is:

Query returned non-zero code: 1, cause: 'SET hive.execution.engine=spark' FAILED in validation : Invalid value.. expects one of [mr, tez].

So if I launch a simple Hive query, I can see on hadoop.hortonworks:8088 that the launched job is a MapReduce job.

Now to my question: how can I change Hive's execution engine so that Hive uses Spark instead of MapReduce? Is there another way to change it? (I already tried changing it via Ambari and in hive-site.xml.)
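For reference, setting the engine in hive-site.xml uses the standard Hive property syntax. A minimal sketch of the relevant entry, which Hive only accepts if the installed build was compiled with Spark support:

<property>
  <!-- only accepted by Hive builds that include Spark support -->
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>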

asked Dec 05 '22 by Baeumla

1 Answer

set hive.execution.engine=spark;

Try this command; it should run fine. Note that spark is only accepted as an engine value by Hive builds that include Spark support (available from Hive 1.1.0 onward); the validation error above, which lists only mr and tez, suggests the installed Hive build does not include it.
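For completeness, the getting-started guide linked in the question also sets a few Spark properties in the same Hive session before running a query. A minimal sketch, using the standalone master URL from the question and illustrative values for the rest:

-- run inside the Hive CLI; memory and serializer values are examples
set hive.execution.engine=spark;
set spark.master=spark://hadoop.hortonworks:7077;
set spark.executor.memory=512m;
set spark.serializer=org.apache.spark.serializer.KryoSerializer;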

answered Jan 23 '23 by Sree Eedupuganti