Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sql: How to remove duplicate rows (content the same, but sequence is different)

You see that the SKU1 has 2 rows, but actually the content of these 2 rows are the same, just the sequence of "b" and "c" makes difference.

What if I want to remove the duplicate rows as shown in the 2nd picture?

In Oracle there is a LEAST/GREATEST function that can realize it, but I used SQL Server, therefore it doesn't work following the instruction of the below post:

How to remove duplicate rows in SQL

enter image description here

like image 421
Héléna Avatar asked May 26 '26 09:05

Héléna


2 Answers

If it's only 2 columns where the order should not matter for the group by?

Then you could use IIF (or a CASE WHEN) to calculate the maximum and minimum values.
And use those calculated values in the GROUP BY.

For example:

select Name, 
MAX(Val1) as Val1,
MIN(Val2) as Val2
from Table1
GROUP BY Name,
IIF(Val2 is null or Val1 < Val2, Val1, Val2),
IIF(Val1 is null or Val1 < Val2, Val2, Val1);

For the example records that would give the result:

Name Val1 Val2
SKU1 20   10
SKU2 20   10

Or if you want to use a fancy XML trick :

select Name, max(Val1) as Val1, min(Val2) as Val2
from (
  select *, 
  cast(
    convert(XML, 
      concat('<n>',Val1,'</n><n>',Val2,'</n>')
    ).query('for $n in /n order by $n return string($n)'
  ) as varchar(6)) as SortedValues
  from Table1
) q
group by Name, SortedValues;

The last method could be more usefull when there are more columns involved.

To actually remove the duplicates?
Here's an example that uses a table variable to demonstrate:

declare @Table1 TABLE (Id int, Name varchar(20), Val1 int, Val2 int);

Insert Into @Table1 values
(1,'SKU1',10,20),
(2,'SKU1',20,10),
(3,'SKU1',12,15),
(4,'SKU2',10,null),
(5,'SKU2',null,10),
(6,'SKU2',10,20);

delete from @Table1
where Id in (
    select Id
    from (
    select Id, 
    row_number() over (partition by Name, 
         IIF(Val2 is null or Val1 < Val2, Val1, Val2),
         IIF(Val1 is null or Val1 < Val2, Val2, Val1)
       order by Val1 desc, Val2 desc
    ) as rn
    from @Table1
    ) q
    where rn > 1
);

select * from @Table1;
like image 170
LukStorms Avatar answered May 28 '26 04:05

LukStorms


Please Use Max() and Min Function instead of least and greatest of oracle if used the follwoing steps and got the same result.

Create Table Transactions (Name varchar(255),Quantity1 int,Quantity2 int)
   Insert Into Transactions values
   ('SKU1',10,20),
      ('SKU1',20,10),
   ('SKU2',10,20),
   ('SKU2',10,20)

Now I used the query below to find the solution of your answer

    Select T1.Name,MAX(T1.Quantity1),MIN(T2.Quantity2) From Transactions   T1
    join Transactions T2
    on T1.Name=T2.Name
    group by T1.Name

Please Reply

like image 33
Rohit Gupta Avatar answered May 28 '26 03:05

Rohit Gupta



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!