I've got a table that has rows that are unique except for one value in one column (let's call it 'Name'). Another column is 'Date' which is the date it was added to the database.
What I want to do is find the duplicate values in 'Name', and then delete the ones with the oldest dates in 'Date', leaving the most recent one.
Seems like a relatively easy query, but I know very little about SQL apart from simple queries.
Any ideas?
According to "Delete Duplicate Rows in SQL", the way to find duplicate rows is the SQL GROUP BY clause, which groups rows that share the same values in the given columns. Applying the COUNT function to each group shows how many times a value occurs; any group with a count above 1 contains duplicates.
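A minimal sketch of that GROUP BY / COUNT check, assuming SQL Server. The #Demo table and its sample rows are made up for illustration; only the column name 'Name' comes from the question.

```sql
-- Hypothetical demo table (not from the question).
create table #Demo (
    ID int identity(1, 1) primary key,
    Name varchar(100),
    DateAdded datetime
);

-- 'YYYYMMDD' literals are used because they do not depend on language settings.
insert #Demo (Name, DateAdded) values
    ('Chocolate', '20200101'),
    ('Candy',     '20200102'),
    ('Chocolate', '20200106');

-- One row per Name that occurs more than once, with its count.
-- HAVING filters on the aggregate, which WHERE cannot do.
select Name, count(*) as Occurrences
from #Demo
group by Name
having count(*) > 1;

drop table #Demo;
```

With the sample rows above, only 'Chocolate' comes back, with a count of 2. This finds the duplicated names but does not delete anything yet.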
1) First identify the rows that satisfy your definition of a duplicate and insert them into a temp table, say #tableAll.
2) Select the non-duplicate (single) or distinct rows into another temp table, say #tableUnique.
3) Delete from the source table, joining to #tableAll, to remove the duplicates.
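The three steps could look roughly like this when adapted to the question (keep the newest row per Name). This is only a sketch: the #Source table and its rows are hypothetical, and the final re-insert of the kept rows is my assumption about how the steps are meant to finish, since the description above stops at the delete.

```sql
-- Hypothetical source table (not from the question).
create table #Source (Name varchar(100), DateAdded datetime);

insert #Source (Name, DateAdded) values
    ('Chocolate', '20200101'),
    ('Candy',     '20200102'),
    ('Chocolate', '20200106');

-- 1) Every row whose Name occurs more than once.
select s.Name, s.DateAdded
into #tableAll
from #Source s
where s.Name in (select Name from #Source group by Name having count(*) > 1);

-- 2) One row per duplicated Name: the most recent one.
select Name, max(DateAdded) as DateAdded
into #tableUnique
from #tableAll
group by Name;

-- 3) Delete all duplicated rows from the source by joining to #tableAll...
delete s
from #Source s
join #tableAll a
    on a.Name = s.Name and a.DateAdded = s.DateAdded;

-- ...then put the kept rows back (assumed final step).
insert #Source (Name, DateAdded)
select Name, DateAdded from #tableUnique;

select Name, DateAdded from #Source order by Name;

drop table #tableAll;
drop table #tableUnique;
drop table #Source;
```

Note that re-inserting rows would generate new values in an identity column, so this route fits best when the table has no surrogate key you care about.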
To delete a duplicate record in SQL Server, we can use the SET ROWCOUNT command to limit the number of rows affected by a query. Setting it to 1 makes the delete remove just one of the matching rows. Note: the select commands are only there to show the data before and after the delete.
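A small sketch of the SET ROWCOUNT approach, assuming two fully identical rows in a hypothetical #Demo table (the table and data are made up). It only makes sense for exact duplicates, because it does not let you choose which of the matching rows is removed.

```sql
-- Hypothetical demo table with one exact duplicate.
create table #Demo (Name varchar(100), DateAdded datetime);

insert #Demo (Name, DateAdded) values
    ('Chocolate', '20200101'),
    ('Chocolate', '20200101'),
    ('Candy',     '20200102');

select * from #Demo;   -- data before the delete

set rowcount 1;        -- following statements affect at most one row
delete from #Demo
where Name = 'Chocolate' and DateAdded = '20200101';
set rowcount 0;        -- 0 switches the limit off again

select * from #Demo;   -- data after the delete: one 'Chocolate' row left

drop table #Demo;
```

Microsoft's documentation marks SET ROWCOUNT as deprecated for DELETE, INSERT and UPDATE and suggests TOP instead, so on current versions `delete top (1) from #Demo where ...` is probably the safer spelling of the same idea.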
One way to find duplicate records in a table is the GROUP BY statement, which arranges rows with identical values into groups, usually together with aggregate functions. That is, if a column has the same value in several rows, those rows end up in one group.
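To see what the grouping gives you for this particular problem, here is a sketch that reports, per Name, how many rows there are and the oldest and newest date. The #Demo table and its rows are hypothetical.

```sql
-- Hypothetical demo table (not from the question).
create table #Demo (Name varchar(100), DateAdded datetime);

insert #Demo (Name, DateAdded) values
    ('Chocolate', '20200101'),
    ('Candy',     '20200102'),
    ('Chocolate', '20200106');

-- One output row per Name; the aggregates are computed within each group.
select Name,
       count(*)       as Occurrences,
       min(DateAdded) as OldestDate,
       max(DateAdded) as NewestDate   -- the row you would want to keep
from #Demo
group by Name
order by Name;

drop table #Demo;
```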
Find duplicates and delete oldest one
Here is the code:
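-- Sample table: two 'Chocolate' rows with different dates
create table #Product (
    ID int identity(1, 1) primary key,
    Name varchar(800),
    DateAdded datetime default getdate()
);

insert #Product (Name) select 'Chocolate';
insert #Product (Name, DateAdded) select 'Candy', dateadd(day, 1, getdate());
insert #Product (Name, DateAdded) select 'Chocolate', dateadd(day, 5, getdate());

-- Data before the delete
select * from #Product;

-- Number the rows within each Name, newest first. row_number() is used instead
-- of dense_rank() so that exactly one row survives even when two rows share the
-- same DateAdded (the higher ID wins the tie).
with Ranked as (
    select ID,
           row_number() over (partition by Name order by DateAdded desc, ID desc) as DupeCount
    from #Product P
)
delete R
from Ranked R
where R.DupeCount > 1;  -- everything except the newest row per Name

-- Data after the delete
select * from #Product;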
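Or delete every row for which a newer row with the same name exists (MyTable is a placeholder for your table name; in SQL Server the alias has to follow the delete keyword):
delete a1 from MyTable a1 where exists (select * from MyTable a2 where a2.Name = a1.Name and a2.[Date] > a1.[Date])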
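A self-contained sketch of that EXISTS delete in SQL Server syntax, so it can be tried without touching real data. The #Demo table, its rows, and the DateAdded column name are hypothetical.

```sql
-- Hypothetical demo table (not from the question).
create table #Demo (
    ID int identity(1, 1) primary key,
    Name varchar(100),
    DateAdded datetime
);

insert #Demo (Name, DateAdded) values
    ('Chocolate', '20200101'),
    ('Candy',     '20200102'),
    ('Chocolate', '20200106');

-- Remove a row whenever another row with the same Name has a later date.
delete a1
from #Demo a1
where exists (
    select 1
    from #Demo a2
    where a2.Name = a1.Name
      and a2.DateAdded > a1.DateAdded
);

-- Remaining rows: 'Candy' and the newer 'Chocolate'.
select Name, DateAdded from #Demo order by Name;

drop table #Demo;
```

One caveat: if two rows for the same Name share the latest date, neither has a strictly newer sibling, so both stay. A tie-break on a unique column would be needed to cover that case.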