Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hive: Finding max value in a group by

Tags:

sql

max

hive

I have a hive table something like this:

create external table test(
  test_id string,
  test_name string,
  description string,
  clicks int,
  last_referred_click_date string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE LOCATION  '{some_location}';

I need to find out total clicks for a test_id and the last click date(max date in that group of test_id)

I am doing something like this

insert overwrite table test partition(weekending='{input_date}')
  select s.test_id,s.test_name,s.description,max(click_date),
    sum(t.click) as clicks
   group by s.test_id,s.test_name,s.description order by clicks desc; 

Does max() function works for strings? My click_date is of teh format'yyyy-mm-dd' and is a string data type? If not, what can I do here ? UDF ?

like image 823
TopCoder Avatar asked Mar 29 '12 10:03

TopCoder


People also ask

Can we use max with group by?

Max() function with Group by In this page we are discussing, how the GROUP BY clause along with the SQL MAX() can be used to find the maximum value of a column over each group.

What is Max () in hive?

The MAX function accepts just one argument, but the GREATEST function requires at least two. The MAX function calls the mapreduce engine, while the GREATEST function does not. If extra columns, in addition to the column given as an input for the MAX function, must be grouped.

How do I find the maximum length of a column in hive?

select max(length(mycolumn)) from mytable; This works fine in hive QL. Save this answer.


1 Answers

SELECT s.test_id,
       s.test_name,
       s.description,
       MAX(CAST(last_referred_click_date as DateTime)), 
       sum(t.clicks) as Total_Clicks
FROM test s
WHERE s.test_id=1
GROUP BY s.test_id,s.test_name,s.description 
ORDER BY clicks desc; 
like image 92
Shahbaz Chishty Avatar answered Oct 18 '22 23:10

Shahbaz Chishty