Deleting Duplicate Records in Oracle based on Maximum Date/Time

I have the following sample data, which contains duplicate information:

ID   Date                 Emp_ID    Name    Keep
---------------------------------------------------------
1    17/11/2010 13:45:22  101       AB      *
2    17/11/2010 13:44:10  101       AB
3    17/11/2010 12:45:22  102       SF      *
4    17/11/2010 12:44:10  102       SF
5    17/11/2010 11:45:22  103       RD      *
6    17/11/2010 11:44:10  103       RD        

Based on the above data set, how can I remove the duplicate rows for each Emp_ID and keep only the row with the maximum date/time?

So based on the above, I would be left with only IDs 1, 3 and 5.

Thanks.

tonyf asked Nov 16 '10 14:11


2 Answers

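Something like:

-- Correlate on emp_id, the column that identifies duplicates.
-- The id column is unique per row, so correlating on it would delete nothing.
DELETE FROM the_table_with_no_name 
WHERE date_column != (SELECT MAX(t2.date_column) 
                      FROM the_table_with_no_name t2 
                      WHERE t2.emp_id = the_table_with_no_name.emp_id);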
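
One way to sanity-check a delete of this shape is to run it against the sample data from the question. The sketch below is a stand-in, not Oracle: it assumes Python's built-in sqlite3 module with an in-memory database, a hypothetical table name emp_log, and timestamps rewritten as ISO-formatted text so that MAX() compares them chronologically. The subquery is correlated on emp_id.

```python
import sqlite3

# Sample rows from the question. Timestamps are rewritten in ISO format
# (an assumption for this sketch) so text comparison matches time order.
ROWS = [
    (1, "2010-11-17 13:45:22", 101, "AB"),
    (2, "2010-11-17 13:44:10", 101, "AB"),
    (3, "2010-11-17 12:45:22", 102, "SF"),
    (4, "2010-11-17 12:44:10", 102, "SF"),
    (5, "2010-11-17 11:45:22", 103, "RD"),
    (6, "2010-11-17 11:44:10", 103, "RD"),
]


def delete_older_duplicates(rows):
    """Load rows, delete every row that is not the latest for its emp_id,
    and return the ids of the surviving rows in ascending order."""
    conn = sqlite3.connect(":memory:")
    # emp_log is a hypothetical table name used only for this sketch.
    conn.execute(
        "CREATE TABLE emp_log ("
        "id INTEGER PRIMARY KEY, date_column TEXT, emp_id INTEGER, name TEXT)"
    )
    conn.executemany("INSERT INTO emp_log VALUES (?, ?, ?, ?)", rows)
    conn.execute(
        """
        DELETE FROM emp_log
        WHERE date_column <> (SELECT MAX(t2.date_column)
                              FROM emp_log t2
                              WHERE t2.emp_id = emp_log.emp_id)
        """
    )
    remaining = [r[0] for r in conn.execute("SELECT id FROM emp_log ORDER BY id")]
    conn.close()
    return remaining


surviving_ids = delete_older_duplicates(ROWS)
print(surviving_ids)  # [1, 3, 5]
```

Note that if two rows for the same emp_id share the maximum timestamp, both of them survive this delete, because neither differs from the maximum.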
a_horse_with_no_name answered Sep 30 '22 15:09


You can generate the ROWIDs of all rows other than the one with the maximum date for each emp_id, and then delete them. I have found this to perform well, as it is a set-based approach that uses analytic functions and ROWIDs.

-- Get the list of all rows to be deleted.
-- date_column stands for the actual date/time column (DATE is a reserved word in Oracle).
select row_id from (
select rowid row_id,
       row_number() over (partition by emp_id order by date_column desc) rn
  from table_name
) where rn <> 1;

And then delete the rows.

delete from table_name where rowid in (
    select row_id from (
    select rowid row_id,
           row_number() over (partition by emp_id order by date_column desc) rn
      from table_name
    ) where rn <> 1
);
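
The row_number() approach can be tried out on the question's sample data in the same spirit. This is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module rather than Oracle (SQLite also exposes a rowid and, from version 3.25, window functions), a hypothetical table name emp_log, a column named date_column, and ISO-formatted text timestamps so that ordering by the column is chronological.

```python
import sqlite3

# Sample rows from the question, with timestamps in ISO format
# (an assumption for this sketch) so ORDER BY sorts them by time.
ROWS = [
    (1, "2010-11-17 13:45:22", 101, "AB"),
    (2, "2010-11-17 13:44:10", 101, "AB"),
    (3, "2010-11-17 12:45:22", 102, "SF"),
    (4, "2010-11-17 12:44:10", 102, "SF"),
    (5, "2010-11-17 11:45:22", 103, "RD"),
    (6, "2010-11-17 11:44:10", 103, "RD"),
]


def delete_by_row_number(rows):
    """Load rows, number them per emp_id from newest to oldest, delete every
    row numbered other than 1, and return the surviving ids in ascending order."""
    conn = sqlite3.connect(":memory:")
    # emp_log is a hypothetical table name used only for this sketch.
    conn.execute(
        "CREATE TABLE emp_log ("
        "id INTEGER PRIMARY KEY, date_column TEXT, emp_id INTEGER, name TEXT)"
    )
    conn.executemany("INSERT INTO emp_log VALUES (?, ?, ?, ?)", rows)
    conn.execute(
        """
        DELETE FROM emp_log WHERE rowid IN (
            SELECT row_id FROM (
                SELECT rowid AS row_id,
                       ROW_NUMBER() OVER (PARTITION BY emp_id
                                          ORDER BY date_column DESC) AS rn
                FROM emp_log
            ) WHERE rn <> 1
        )
        """
    )
    remaining = [r[0] for r in conn.execute("SELECT id FROM emp_log ORDER BY id")]
    conn.close()
    return remaining


surviving_ids = delete_by_row_number(ROWS)
print(surviving_ids)  # [1, 3, 5]
```

Because row_number() assigns 1 to exactly one row per partition, this form always leaves a single row per emp_id, even when two rows share the maximum timestamp. Which of the tied rows is kept is arbitrary unless a tiebreaker column is added to the ORDER BY.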
Rajesh Chamarthi answered Sep 30 '22 16:09