Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

HIVE: How to include null rows in lateral view explode

Tags:

hive

hiveql

I have a table as follows:

user_id email
u1      e1, e2
u2      null

My goal is to convert this into the following format:

user_id email
u1      e1
u1      e2
u2      null

So for this I am using the lateral view explode() function in Hive, as follows:

select *  FROM  table
       LATERAL VIEW explode (split(email ,',')) email AS email_id 

But doing this the u2 row is getting skipped as it has null value in email. How can we include the nulls too in the output?

Edit: I am using a workaround doing an union of this table with the base table without explode, but I think the data will be scanned one more time because of this. I wanted to know if there is a better way to do it.

like image 745
Dileep Kumar Patchigolla Avatar asked Nov 21 '16 08:11

Dileep Kumar Patchigolla


People also ask

What does the explode function in lateral view do?

The LATERAL VIEW clause is used in conjunction with generator functions such as EXPLODE , which will generate a virtual table containing one or more rows. LATERAL VIEW will apply the rows to each original output row.

How do you handle null in Hive?

Use nvl() function in Hive to replace all NULL values of a column with a default value, In this article, I will explain with an example. Replace all NULL values with -1 or 0 or any number for the integer column. Replace all NULL values with empty space for string types. Replace with any value based on your need.

How do you explode an array column in Hive?

Explode is a User Defined Table generating Function(UDTF) in Hive. It takes an array (or a map) as an input and outputs the elements of the array (or a map) as separate rows. UDTFs can be used in the SELECT expression list and as a part of LATERAL VIEW. LATERAL VIEW statement is used with UDTF such as explode().


1 Answers

include OUTER in the query to get rows with NULL values

something like,

select *  FROM table LATERAL VIEW OUTER explode (  split (  email  ,','  ) ) email AS email_id;

check this link -> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView#LanguageManualLateralView-OuterLateralViews

like image 95
Nirmal Ram Avatar answered Sep 23 '22 02:09

Nirmal Ram