How to find duplicates and gaps in this scenario in MySQL

Q: How to find duplicate values on one column of a table?

To find duplicate values in one column of a table, follow these steps: first, use the GROUP BY clause to group all rows by the target column, which is the column you want to check for duplicates. Then use the COUNT() function in the HAVING clause to check whether any group has more than one element.
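The two steps above can be sketched as a single query. This is a minimal illustration only; the table name `customers` and the column `email` are hypothetical and do not come from the question below.

```sql
-- Hypothetical table and column names, used only for illustration.
SELECT
    email,                   -- the column being checked for duplicates
    COUNT(*) AS occurrences  -- how many rows share this value
FROM
    customers
GROUP BY
    email
HAVING
    COUNT(*) > 1;            -- keep only values that appear more than once
```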

Q: What is the impact of duplicate records in a database table?

The impact of duplicate records in a database table can range from a minor inconvenience to a disaster. MySQL has a few keywords that can be combined to scan a table for duplicates. We can also count the occurrences of duplicate records and delete them where necessary.

Tags:

select

mysql

Hi, I have a table that looks like this:

-----------------------------------------------------------
|  id  |  group_id | source_id | target_id | sortsequence |
-----------------------------------------------------------
|  2   |    1      |    2      |   4       |     1        |   
-----------------------------------------------------------
|  4   |    1      |    20     |   2       |     1        |   
-----------------------------------------------------------
|  5   |    1      |    2      |   14      |     1        |   
-----------------------------------------------------------
|  7   |    1      |    2      |   7       |     3        |   
-----------------------------------------------------------
|  20  |    2      |    20     |   4       |     3        |   
-----------------------------------------------------------
|  21  |    2      |    20     |   4       |     1        |   
-----------------------------------------------------------

Scenario

There are two scenarios that need to be handled.

1. The sortsequence value should be unique within one source_id and group_id. For example, all the records with group_id = 1 AND source_id = 2 should have unique sortsequence values. In the example above, the records with id = 2 and 5, which have group_id = 1 and source_id = 2, have the same sortsequence, which is 1. These are faulty records. I need to find these records.
2. If group_id and source_id are the same, the sortsequence values should be continuous, with no gaps. For example, in the table above the records with id = 20 and 21 have the same group_id and source_id, and their sortsequence values are 3 and 1. These values are unique, but there is a gap in the sequence. I also need to find these records.

My Effort So Far

I have written this query:

SELECT source_id, `group_id`, GROUP_CONCAT(id) AS children
FROM
    `table`  -- placeholder name; backticks are needed because TABLE is a reserved word
GROUP BY source_id,
  sortsequence,
  `group_id`
HAVING COUNT(*) > 1

This query only addresses scenario 1. How can I handle scenario 2? Is there a way to do it in the same query, or do I have to write another one for the second scenario?

By the way, the query will be dealing with millions of records in the table, so performance must be very good.


asked Mar 26 '13 07:03

Awais Qarni

1 Answer

I got the answer from Tere J's comments. The following query covers both of the criteria mentioned above.

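SELECT
    source_id, `group_id`, GROUP_CONCAT(id) AS faultyIDS
FROM
    `table`  -- placeholder name; backticks are needed because TABLE is a reserved word
GROUP BY
    source_id, group_id
HAVING
    COUNT(DISTINCT sortsequence) <> COUNT(sortsequence)  -- some value occurs more than once
    OR COUNT(sortsequence) <> MAX(sortsequence)          -- row count differs from the highest value
    OR MIN(sortsequence) <> 1                            -- sequence does not start at 1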

Maybe it can help others.
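To see what the query reports, here is a minimal sketch that loads the six sample rows from the question and runs it. The table name `relations`, the column types, and the composite index are assumptions made for illustration, since the question gives no schema. The ORDER BY clauses are added only to make the output deterministic.

```sql
-- Hypothetical table name and column types; the question does not give a schema.
CREATE TABLE relations (
    id           INT PRIMARY KEY,
    group_id     INT NOT NULL,
    source_id    INT NOT NULL,
    target_id    INT NOT NULL,
    sortsequence INT NOT NULL,
    -- Assumed index covering the grouping columns and the aggregated column.
    INDEX idx_group_seq (source_id, group_id, sortsequence)
);

-- The six sample rows from the question.
INSERT INTO relations (id, group_id, source_id, target_id, sortsequence) VALUES
    (2,  1, 2,  4,  1),
    (4,  1, 20, 2,  1),
    (5,  1, 2,  14, 1),
    (7,  1, 2,  7,  3),
    (20, 2, 20, 4,  3),
    (21, 2, 20, 4,  1);

SELECT
    source_id,
    group_id,
    GROUP_CONCAT(id ORDER BY id) AS faultyIDS
FROM
    relations
GROUP BY
    source_id, group_id
HAVING
    COUNT(DISTINCT sortsequence) <> COUNT(sortsequence)  -- some value occurs more than once
    OR COUNT(sortsequence) <> MAX(sortsequence)          -- row count differs from the highest value
    OR MIN(sortsequence) <> 1                            -- sequence does not start at 1
ORDER BY
    source_id, group_id;

-- Result for the sample rows:
--   source_id | group_id | faultyIDS
--   2         | 1        | 2,5,7
--   20        | 2        | 20,21
```

The group with source_id = 2 and group_id = 1 is flagged because the value 1 occurs twice. The group with source_id = 20 and group_id = 2 is flagged because it has two rows but its highest sortsequence is 3. The single row with source_id = 20 and group_id = 1 passes all three checks. Note that the query lists every id in a faulty group, not only the offending rows. Whether the assumed index actually helps on millions of rows would need to be checked with EXPLAIN on the real table.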

answered Sep 30 '22 20:09

Awais Qarni

