I have a problem using Hive on Spark. I have installed a single-node HDP 2.1 (Hadoop 2.4) via Ambari on CentOS 6.5. I'm trying to run Hive on Spark, so I followed these instructions:
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
I downloaded the "Prebuilt for Hadoop 2.4" version of Spark from the official Apache Spark website. Then I started the master with:
./spark-class org.apache.spark.deploy.master.Master
Then the worker with:
./spark-class org.apache.spark.deploy.worker.Worker spark://hadoop.hortonworks:7077
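As a sanity check (port 8080 is just the standalone master's default web UI port, it's not mentioned in the guide), I confirmed the worker registered with the master:
# the Workers table on this page should list hadoop.hortonworks
curl http://hadoop.hortonworks:8080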
Then I started Hive with this command:
hive --auxpath /SharedFiles/spark-1.0.1-bin-hadoop2.4/lib/spark-assembly-1.1.0-hadoop2.4.0.jar
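Just to rule out a path typo, I checked that the assembly jar actually exists at that location (the file name is whatever ships in the lib directory of the download):
ls /SharedFiles/spark-1.0.1-bin-hadoop2.4/lib/spark-assembly-*.jar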
Then, according to the instructions, I had to switch Hive's execution engine to Spark with this command:
set hive.execution.engine=spark;
And the result is:
Query returned non-zero code: 1, cause: 'SET hive.execution.engine=spark' FAILED in validation : Invalid value.. expects one of [mr, tez].
So when I launch a simple Hive query, I can see on hadoop.hortonworks:8088 that the launched job is a MapReduce job.
Now to my question: how can I change Hive's execution engine so that it uses Spark instead of MapReduce? Are there any other ways to change it? (I already tried changing it via Ambari and in hive-site.xml.)
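For reference, the hive-site.xml entry I tried looks roughly like this (on an Ambari-managed node the file should be under /etc/hive/conf, though that path may differ on other setups):
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>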
Try this command; it should run fine:
set hive.execution.engine=spark;
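A minimal session would look like this (my_table is only a placeholder table name):
hive> set hive.execution.engine=spark;
hive> select count(*) from my_table;
If the setting is accepted, the query should no longer show up as a MapReduce job.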