Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to Use Multiple Columns in Partition By And Ensure No Duplicate Row is Returned

I have used multiple columns in Partition By statement in SQL but duplicate rows are returned back. I only want distinct rows being returned back.

This is what I have coded in Partition By:

SELECT DATE, STATUS, TITLE, ROW_NUMBER() OVER (PARTITION BY DATE, STATUS, TITLE ORDER BY QUANTITY ASC) AS Row_Num
    FROM TABLE

This is the output I get currently: (Where there are duplicate rows being returned - Please Refer to Row 6 to 8)

enter image description here

This is the output I want to achieve: (no duplicate row being returned - Please Refer to Row 6 to 8)

enter image description here

Question: How can I place multiple columns in 1 Partition By and Ensure No Duplicate Row is Returned?

Appreciate if someone can provide me help on this, thanks a lot!!

like image 830
Felicia Soh Avatar asked Oct 08 '15 09:10

Felicia Soh


People also ask

Can we do partition on multiple columns?

Multi-column partitioning allows us to specify more than one column as a partition key. Currently multi-column partitioning is possible only for range and hash type. Range partitioning was introduced in PostgreSQL10 and hash partitioning was added in PostgreSQL 11.

How do I remove duplicates in multiple columns in SQL?

The best way to delete duplicate rows by multiple columns is the simplest one: Add an UNIQUE index: ALTER IGNORE TABLE your_table ADD UNIQUE (field1,field2,field3); The IGNORE above makes sure that only the first found row is kept, the rest discarded.


3 Answers

Try this, It worked for me

SELECT * FROM (
            SELECT
                [Code],
                [Name],
                [CategoryCode],
                [CreatedDate],
                [ModifiedDate],
                [CreatedBy],
                [ModifiedBy],
                [IsActive],
                ROW_NUMBER() OVER(PARTITION BY [Code],[Name],[CategoryCode] ORDER BY ID DESC) rownumber
            FROM MasterTable
          ) a
        WHERE rownumber = 1 
like image 143
Ankit Raonka Avatar answered Nov 22 '22 05:11

Ankit Raonka


If your table columns contains duplicate data and If you directly apply row_ number() and create PARTITION on column, there is chance to have result in duplicated row and with row number value.

To remove duplicate row, you need one more INNER query in from clause which eliminates duplicate rows and then it will give output to it's foremost outer FROM clause where you can apply PARTITION and ROW_NUMBER ().

As like below example:

SELECT DATE, STATUS, TITLE, ROW_NUMBER() OVER (PARTITION BY DATE, STATUS, TITLE ORDER BY QUANTITY ASC) AS Row_Num
FROM (
     SELECT DISTINCT <column names>...
) AS tbl
like image 39
Bhavesh Harsora Avatar answered Nov 22 '22 05:11

Bhavesh Harsora


I'd create a cte and do an inner join. It's not efficient but it's convenient

with table as (
SELECT DATE, STATUS, TITLE, ROW_NUMBER() 
OVER (PARTITION BY DATE, STATUS,  TITLE ORDER BY QUANTITY ASC) AS Row_Num
 FROM TABLE)

select *

from table t
join select(
max(Row_Num) as Row_Num
,DATE
,STATUS
,TITLE
from table 
group by date, status, title) t2  
on t2.Row_Num = t.Row_Num and t2
and t2.date = t.date
and t2.title = t.title
like image 30
Andrew Troiano Avatar answered Nov 22 '22 05:11

Andrew Troiano