
Expression "IS NOT NULL" not working on HQL

Tags: null, hql, hive

When I run a select for non-null values on a Hive table, the results are wrong. The query behaves as if the "is not null" condition were not there!

Example:

select count(*)
from test_table
where test_col_id1='12345' and test_col_id2 is not null;

Note that test_col_id1 and test_col_id2 are not partition keys.

Here's my Hive version:

Hive 0.14.0.2.2.0.0-2041

Here's the table:

... | test_col_id1 | test_col_id2 |
... | 12345 | x |
... | 12345 | NULL |

The query above returns a count of 2, although only one of these rows has a non-null test_col_id2.

Naveen Karnam asked May 19 '16 12:05


1 Answer

Try the following query. Does it return 1 rather than 2?

select count(*)
from test_table
where test_col_id1='12345' and test_col_id2 != 'NULL';

If so, your NULL is not a real NULL; it's the string 'NULL'. Lots of people have problems with how Hive treats NULL strings. For text tables, the default NULL marker is \N, so a literal 'NULL' in the data is read as an ordinary string. If we want anything else to be recognized as NULL, we have to specify it explicitly when we create the table. Here are 3 examples of how to change what is recognized as NULL. The first one treats 'NULL' strings as NULL:

-- treats the string 'NULL' as NULL
CREATE TABLE nulltest1 (id STRING, another_string STRING)
TBLPROPERTIES('serialization.null.format'='NULL');
-- treats the empty string as NULL
CREATE TABLE nulltest2 (id STRING, another_string STRING)
TBLPROPERTIES('serialization.null.format'='');
-- treats \N as NULL
CREATE TABLE nulltest3 (id STRING, another_string STRING)
TBLPROPERTIES('serialization.null.format'='\N');

Since you've already created your table, you can alter it so that it recognizes your 'NULL' as NULL:

ALTER TABLE test_table SET TBLPROPERTIES ('serialization.null.format' = 'NULL');
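
To see what the column actually holds, a query along these lines may help. It is only a sketch that reuses the table and column names from the question; the aliases real_nulls, string_nulls and empty_strings are made up for illustration. It counts real NULLs, literal 'NULL' strings and empty strings separately, so it can be run before and after the ALTER to check whether the change took effect:

```sql
-- Show the NULL marker currently set on the table, if any.
SHOW TBLPROPERTIES test_table('serialization.null.format');

-- Count each kind of "missing" value separately.
-- The aliases are hypothetical names chosen for this example.
SELECT
  SUM(CASE WHEN test_col_id2 IS NULL  THEN 1 ELSE 0 END) AS real_nulls,
  SUM(CASE WHEN test_col_id2 = 'NULL' THEN 1 ELSE 0 END) AS string_nulls,
  SUM(CASE WHEN test_col_id2 = ''     THEN 1 ELSE 0 END) AS empty_strings
FROM test_table
WHERE test_col_id1 = '12345';
```

A plain comparison such as test_col_id2 != 'NULL' cannot tell the two cases apart on its own, because comparing a real NULL with anything yields NULL and the row is filtered out either way. Counting each case explicitly avoids that ambiguity.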
Edward R. Mazurek answered Oct 21 '22 06:10