What is the difference between -hivevar and -hiveconf?

From hive -h :

--hiveconf <property=value>   Use value for given property
--hivevar <key=value>         Variable subsitution to apply to hive
                                  commands. e.g. --hivevar A=B
Karnimrod asked Jun 23 '16 16:06


3 Answers

I didn't quite feel like the examples from the documentation were adequate, so here's my attempt at an answer.

In the beginning there was only --hiveconf and variable substitution didn't exist.

The --hiveconf option allowed users to set Hive configuration values from the command line and that was it. All Hive configuration values are stored under the hiveconf namespace, e.g. hiveconf:mapred.reduce.tasks. These values let you control things like the number of mappers and reducers, whether status messages should be displayed, and whether the script should continue on errors.

Later, variable substitution was added. This meant you could now use variables in queries with the ${...} syntax. However, the only variables you could set from the command line were under the hiveconf namespace using --hiveconf, so that's where users put their variables.

Putting your personal variables under the Hive configuration namespace probably won't break anything, but it's also not good form. Later, it was suggested that a hivevar namespace be added specifically for user variables, which could also be defined at the command line using --hivevar. This meant a cleaner separation between Hive configuration values and user-defined variables.

In summary:
The hiveconf namespace and --hiveconf should be used to set Hive configuration values.
The hivevar namespace and --hivevar should be used to define user variables.
Setting user variables under the hiveconf namespace probably won't break anything, but isn't recommended.
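
To make that separation concrete, here's a rough sketch that passes one configuration value and one user variable in the same invocation. The variable name run_dt and both values are made up for illustration, and the script only calls hive if it finds it on the PATH.

```shell
#!/bin/sh
# Hypothetical values, chosen only for illustration.
reducers=4             # a Hive configuration value
run_dt='2016-06-23'    # a user-defined variable

# Single quotes keep the shell from expanding ${...}; Hive does the substitution.
query='SELECT "${hivevar:run_dt}" AS run_dt, "${hiveconf:mapred.reduce.tasks}" AS reducers;'

if command -v hive >/dev/null 2>&1; then
  # Configuration goes through --hiveconf, the user variable through --hivevar.
  hive --hiveconf mapred.reduce.tasks="$reducers" \
       --hivevar run_dt="$run_dt" \
       -e "$query"
else
  echo "hive not found on PATH; would run: $query"
fi
```

Keeping run_dt out of the hiveconf namespace means it can never be mistaken for a real Hive setting.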

Mr. Llama answered Sep 28 '22 09:09


@Llama has explained this in detail. In addition, the two types of variable are accessed differently.

Variables set with --hivevar are accessed as ${var-name}, while those set with --hiveconf are accessed as ${hiveconf:var-name} inside Hive.

For example, the commands below each read a variable and print its value in Hive.

hivevar:

hive --hivevar a='this is a' -e '!echo ${a};' 

output: this is a

hiveconf:

hive --hiveconf a='this is a' -e '!echo ${hiveconf:a};' 

output: this is a
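
Note that both commands wrap the -e argument in single quotes. A small sketch of why that matters: with double quotes, the shell would expand ${a} itself before Hive ever sees it. The shell variable a below is hypothetical and exists only to show the difference; nothing here calls hive.

```shell
#!/bin/sh
a='set by the shell'   # hypothetical shell variable that happens to share the name

single='SELECT "${a}";'   # single quotes: ${a} is passed on literally for Hive to resolve
double="SELECT '${a}';"   # double quotes: the shell substitutes ${a} first

printf '%s\n' "$single"   # prints: SELECT "${a}";
printf '%s\n' "$double"   # prints: SELECT 'set by the shell';
```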

Rahul Sharma answered Sep 28 '22 09:09


Both kinds of variable can also be set with SET at the beginning of a script and then referenced:

hiveconf:

-- A SET without a namespace prefix stores the value under hiveconf.
SET this_dt = CURRENT_DATE;
select ${hiveconf:this_dt};

hivevar:

set hivevar:cur_dt=current_date;
select ${hivevar:cur_dt};
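
As a variation, the value can come from the command line instead of a SET inside the script. This is only a sketch: the temporary file and the date are made up for illustration, and hive is called only if it is on the PATH.

```shell
#!/bin/sh
script=$(mktemp)                  # hypothetical throwaway script file
trap 'rm -f "$script"' EXIT       # clean up when the shell exits

# The quoted 'EOF' stops the shell from expanding ${...} inside the here-document.
cat > "$script" <<'EOF'
-- cur_dt is supplied by the caller via --hivevar
SELECT '${hivevar:cur_dt}' AS cur_dt;
EOF

if command -v hive >/dev/null 2>&1; then
  hive --hivevar cur_dt='2016-06-23' -f "$script"
else
  echo "hive not found on PATH; script contents:"
  cat "$script"
fi
```

This way the same script can be reused with a different cur_dt on each run.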
RPaul answered Sep 28 '22 09:09