Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using QUALIFY Row_Number in hive

I'm working with Teradata conversion to Hive (version 0.10.0).

Teradata Query :

QUALIFY ROW_NUMBER() OVER (PARTITION BY ADJSTMNT,SRC_CMN , TYPE_CMD,IOD_TYPE_CD,ROE_PST ,ORDR_SYC,SOR_CD,PROS_ED ORDER BY ADJSTMNT )=1

I did my search and found UDF for Row_Sequence in hive. I also replaced Over Partition with Distribute All and sort By. But I am stuck with QUALIFY.

Any ideas to convert the above to hive are really appreciated and will help us a lot.

like image 773
dyuti Avatar asked Jul 09 '13 04:07

dyuti


People also ask

Can we use Rownum in hive?

Row_number is one of the analytics function in Hive. It will assign the unique number(1,2,3…) for each row based on the column value that used in the OVER clause. In addition, A partitioned By clause is used to split the rows into groups based on column value.

What does qualify row number do?

We can use row number with qualify function to extract the required results. The Row number function ordered the marks with row number. In order to get the highest marks in each subject, we are using the Qualify function to take the the record that has row number as 1.

How do I select a row number in hive?

Use the ROW_NUMBER() function to get top rows by partition in Hive. SELECT t2. column_A as column_A, t2. cnt as cnt FROM ( SELECT t.

Can we use ROW_NUMBER without partition by?

ROW_NUMBER() Function without Partition By clausePartition by clause is an optional part of Row_Number function and if you don't use it all the records of the result-set will be considered as a part of single record group or a single partition and then ranking functions are applied.


1 Answers

a QUALIFY with analytics function (ROW_NUMBER(), SUM(), COUNT(), ... over (partition by ...)) is just a WHERE on a subquery containing the analytics value.

eg:

select A,B,C
from X 
QUALIFY  ROW_NUMBER() over (...) = 1

is equivalent to :

select A,B,C
from (
   select A,B,C, ROW_NUMBER() over (...) as RNUM
   from X
) t
where RNUM = 1

NB: analytics function are available in Hive 0.12

like image 59
R. Chevallier Avatar answered Sep 29 '22 15:09

R. Chevallier