
Replace the empty or NULL value with specific value in HIVE query result

I'm trying to show a default value, "Others", when the query returns an empty value for one of the selected columns. Here is an example.

This query returns an empty value for os(agent) SO (in the first row):

select country, os(agent) SO, count(*) from clicks_data
where country is not null and os(agent) is not null
group by country, os(agent);

Output:

ZA           4
ZA  Android  4
ZA  Mac      8
ZA  Windows  5

Instead, I would like to get this result:

ZA  Others  4
ZA  Android 4
ZA  Mac     8
ZA  Windows 5

My next attempt was this query, but it's not really working, either:

select country, regexp_replace(os(agent),'','Others') SO, count(*) from clicks_data 
where country is not null and os(agent) is not null 
group by country, os(agent);

This is the result:

ZA  Others  4
ZA  OthersAOthersnOthersdOthersrOthersoOthersiOthersdOthers 4
ZA  OthersMOthersaOtherscOthers 8
ZA  OthersWOthersiOthersnOthersdOthersoOtherswOtherssOthers 5

Sebastian Loeb Sucre asked May 17 '15 09:05




1 Answer

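Use LENGTH() to check the length of the column value. It returns a number greater than 0 when the column holds a non-empty string, 0 for an empty string, and NULL for a NULL value. In the last two cases the condition LENGTH(...) > 0 is not satisfied.

Then wrap the column value in a CASE WHEN ... END block, so that anything failing that condition falls through to the ELSE branch.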

The final query may look like:

SELECT country, CASE WHEN LENGTH(os(agent)) > 0 THEN os(agent) ELSE 'Others' END AS SO, COUNT(*) 
FROM clicks_data 
WHERE country IS NOT NULL AND os(agent) IS NOT NULL 
GROUP BY country, os(agent);
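
A possible variant of the query above, offered as a sketch rather than a tested solution: it computes the label once in a subquery and groups by that label, so rows with an empty os(agent) and rows with a NULL os(agent) end up in a single 'Others' row per country. It assumes that NULL values should also count as 'Others', which is why the os(agent) IS NOT NULL filter is dropped. The aliases t and clicks are made up for illustration.

```sql
-- Sketch: label empty/NULL os(agent) values as 'Others' once, then group by the label.
-- Assumption: NULL os(agent) values should be counted as 'Others' too.
SELECT country, SO, COUNT(*) AS clicks      -- "clicks" is an illustrative alias
FROM (
  SELECT country,
         -- LENGTH() is 0 for '' and NULL for NULL; neither satisfies "> 0"
         CASE WHEN LENGTH(os(agent)) > 0 THEN os(agent) ELSE 'Others' END AS SO
  FROM clicks_data
  WHERE country IS NOT NULL
) t                                         -- "t" is an illustrative subquery alias
GROUP BY country, SO;
```

If NULL rows should stay excluded as in the original query, keep the os(agent) IS NOT NULL condition in the inner WHERE clause.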

Hope this helps!

Farooque answered Jan 26 '23 12:01