Hive SQL: find the latest record for each id

Tags: sql, group-by, max, hive

The table is:

create table test (
  id string,
  name string,
  age string,
  modified string
);

The data looks like this:

id    name   age  modified
1     a      10   2011-11-11 11:11:11
1     a      11   2012-11-11 12:00:00
2     b      20   2012-12-10 10:11:12
2     b      20   2012-12-10 10:11:12
2     b      20   2012-12-12 10:11:12
2     b      20   2012-12-15 10:11:12

I want to get the latest record for each id, including every column (id, name, age, modified). For the data above, the correct result is:

1     a      11   2012-11-11 12:00:00
2     b      20   2012-12-15 10:11:12

This is what I do now:

insert overwrite table t
select b.id, b.name, b.age, b.modified
from (
  select id, max(modified) as modified
  from test
  group by id
) a
left outer join test b
  on (a.id = b.id and a.modified = b.modified);

This SQL gives the right result, but it runs slowly on large amounts of data.

**Is there any way to do this without a left outer join?**

asked Nov 23 '12 04:11

qiulp

1 Answer

There's a nearly undocumented feature of Hive SQL (I found it in one of their Jira bug reports) that lets you do something like argmax() using struct()s. For example, if you have a table like:

test_argmax
id,val,key
1,1,A
1,2,B
1,3,C
1,2,D
2,1,E
2,1,U
2,2,V
2,3,W
2,2,X
2,1,Y

You can do this:

select
  max(struct(val, key, id)).col1 as max_val,
  max(struct(val, key, id)).col2 as max_key,
  max(struct(val, key, id)).col3 as max_id
from test_argmax
group by id

and get the result:

max_val,max_key,max_id
3,C,1
3,W,2

I think in case of ties on val (the first struct element) it will fall back to comparison on the second column. I also haven't figured out whether there's a neater syntax for getting the individual columns back out of the resulting struct, maybe using named_struct somehow?
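Applied to the `test` table from the question, the same trick might look like the sketch below. It is untested and rests on two assumptions: that `modified` always uses the fixed-width `yyyy-MM-dd HH:mm:ss` format, so that string comparison matches chronological order, and that `max()` over a struct compares fields from left to right as described above. The aliases `latest` and `m` are made up for illustration; `col1` to `col4` are the default field names that `struct()` assigns. Computing the struct once in a subquery also avoids repeating the `max(struct(...))` expression for every output column.

```sql
-- Sketch: latest row per id without a join.
-- modified is the first struct field, so it drives the comparison.
insert overwrite table t
select
  m.latest.col2 as id,
  m.latest.col3 as name,
  m.latest.col4 as age,
  m.latest.col1 as modified
from (
  -- One struct per id: the one with the greatest modified value.
  select max(struct(modified, id, name, age)) as latest
  from test
  group by id
) m;
```

Because the aggregate yields a single struct per group, this should return exactly one row per id even when the newest row appears more than once in `test`, whereas the join version would return every duplicate of that row.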

answered Oct 11 '22 13:10

patricksurry
