 

Row number functionality in Hive

Tags: hive, hiveql

How can I generate row numbers for an existing table while running a select query?
For example:

select row_number(), * from emp;

I am using Hive 0.13. I can't use external JARs or UDFs in my environment. The underlying files are in Parquet format.

Thanks in advance!

Gayatri Mahesh, asked May 27 '16 07:05


2 Answers

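ROW_NUMBER() is a windowing function, so it must be used with an OVER clause. To number the whole result set, leave the OVER clause empty (no PARTITION BY). Without an ORDER BY inside OVER, the order in which rows are numbered is not guaranteed.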

SELECT *, ROW_NUMBER() OVER () AS row_num
FROM emp
--- other stuff
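
If the numbering needs to be reproducible between runs, adding an ORDER BY inside the OVER clause is one way to get that. A minimal sketch, assuming emp has a column named emp_id (a hypothetical name, not given in the question):

```sql
-- emp_id is an assumed column name; substitute whichever column(s)
-- define the order you want the rows numbered in.
SELECT *, ROW_NUMBER() OVER (ORDER BY emp_id) AS row_num
FROM emp;
```

With no PARTITION BY, the whole table is treated as a single window, so this will likely be slow on large tables.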
o-90, answered Oct 05 '22 21:10


row_number() can also be combined with PARTITION BY to find, for example, the most recent visit of each user on your site.

SELECT user_id, user_name, timestamp
FROM (
    -- Number each user's visits, newest first
    SELECT user_id, user_name, timestamp,
           row_number() OVER (PARTITION BY user_id ORDER BY timestamp DESC) AS visit_number
    FROM user
) user_table
WHERE visit_number = 1
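
The same pattern extends to keeping the N most recent rows per user by loosening the outer filter. A sketch reusing the table and column names from the query above, with N = 3 chosen purely for illustration:

```sql
-- Keep up to three most recent visits per user.
-- The limit of 3 is an arbitrary example value.
SELECT user_id, user_name, timestamp, visit_number
FROM (
    SELECT user_id, user_name, timestamp,
           row_number() OVER (PARTITION BY user_id ORDER BY timestamp DESC) AS visit_number
    FROM user
) user_table
WHERE visit_number <= 3;
```

If two visits by the same user share a timestamp, which one gets the lower number is not determined by this query; adding a tie-breaking column to the ORDER BY would fix that.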
sumitya, answered Oct 05 '22 23:10