I'm looking for the SQL equivalent of SET varname = value in Hive QL
I know I can do something like this:
SET CURRENT_DATE = '2012-09-16'; SELECT * FROM foo WHERE day >= @CURRENT_DATE But then I get this error:
character '@' not supported here
We can use the hive SET command to either override an existing hive property(can not override environmental variables) or display the configuration property of the hive containing system and environmental variables. The separated property can also be checked with the SET command in the hive.
Hiveconf is the namespace used when you use set without explicit namespace or when you give a variable on the command line with –hiveconf foo=bar. Note that you can set those without specifying the namespace, but you always need to specify the namespace when using them.
You need to use the special hiveconf for variable substitution. e.g.
hive> set CURRENT_DATE='2012-09-16'; hive> select * from foo where day >= ${hiveconf:CURRENT_DATE} similarly, you could pass on command line:
% hive -hiveconf CURRENT_DATE='2012-09-16' -f test.hql Note that there are env and system variables as well, so you can reference ${env:USER} for example.
To see all the available variables, from the command line, run
% hive -e 'set;' or from the hive prompt, run
hive> set; Update: I've started to use hivevar variables as well, putting them into hql snippets I can include from hive CLI using the source command (or pass as -i option from command line). The benefit here is that the variable can then be used with or without the hivevar prefix, and allow something akin to global vs local use.
So, assume have some setup.hql which sets a tablename variable:
set hivevar:tablename=mytable; then, I can bring into hive:
hive> source /path/to/setup.hql; and use in query:
hive> select * from ${tablename} or
hive> select * from ${hivevar:tablename} I could also set a "local" tablename, which would affect the use of ${tablename}, but not ${hivevar:tablename}
hive> set tablename=newtable; hive> select * from ${tablename} -- uses 'newtable' vs
hive> select * from ${hivevar:tablename} -- still uses the original 'mytable' Probably doesn't mean too much from the CLI, but can have hql in a file that uses source, but set some of the variables "locally" to use in the rest of the script.
Most of the answers here have suggested to either use hiveconf or hivevar namespace to store the variable. And all those answers are right. However, there is one more namespace.
There are total three namespaces available for holding variables.
And so if you are storing a variable as part of a query (i.e. date or product_number) you should use hivevar namespace and not hiveconf namespace.
And this is how it works.
hiveconf is still the default namespace, so if you don't provide any namespace it will store your variable in hiveconf namespace.
However, when it comes to referring a variable, it's not true. By default it refers to hivevar namespace. Confusing, right? It can become clearer with the following example.
If you do not provide namespace as mentioned below, variable var will be stored in hiveconf namespace.
set var="default_namespace"; So, to access this you need to specify hiveconf namespace
select ${hiveconf:var}; And if you do not provide namespace it will give you an error as mentioned below, reason being that by default if you try to access a variable it checks in hivevar namespace only. And in hivevar there is no variable named var
select ${var}; We have explicitly provided hivevar namespace
set hivevar:var="hivevar_namespace"; as we are providing the namespace this will work.
select ${hivevar:var}; And as default, workspace used during referring a variable is hivevar, the following will work too.
select ${var};
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With