Hi, I have a table that looks like this:
--------------------------------------------------------
| id | group_id | source_id | target_id | sortsequence |
--------------------------------------------------------
|  2 |        1 |         2 |         4 |            1 |
|  4 |        1 |        20 |         2 |            1 |
|  5 |        1 |         2 |        14 |            1 |
|  7 |        1 |         2 |         7 |            3 |
| 20 |        2 |        20 |         4 |            3 |
| 21 |        2 |        20 |         4 |            1 |
--------------------------------------------------------
Scenario
There are two scenarios that need to be handled.
First, the sortsequence column value should be unique for a given source_id and group_id. For example, all the records having group_id = 1 AND source_id = 2 should have unique sortsequence values. In the above example, the records with id = 2 and 5, which have group_id = 1 and source_id = 2, have the same sortsequence, which is 1. These are faulty records. I need to find these records.
Second, when group_id and source_id are the same, the sortsequence column values should be continuous. There should be no gap. For example, in the above table the records with id = 20 and 21 have the same group_id and source_id, and their sortsequence values are 3 and 1. Even though these values are unique, there is a gap in the sortsequence values. I need to find these records as well.
My Effort So Far
I have written a query
SELECT source_id, `group_id`, GROUP_CONCAT(id) AS children
FROM `table`
GROUP BY source_id, sortsequence, `group_id`
HAVING COUNT(*) > 1
This query only addresses scenario 1. How can I handle scenario 2? Is there any way to do it in the same query, or do I have to write another one to handle the second scenario?
By the way, the query will be dealing with millions of records in the table, so performance must be very good.
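To make the limitation concrete, here is a minimal sketch that runs the scenario 1 query against the six sample rows. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database standing in for MySQL, and the table name links is hypothetical (the question does not give the real table name).

```python
import sqlite3

# Sample rows from the question: (id, group_id, source_id, target_id, sortsequence).
ROWS = [
    (2, 1, 2, 4, 1),
    (4, 1, 20, 2, 1),
    (5, 1, 2, 14, 1),
    (7, 1, 2, 7, 3),
    (20, 2, 20, 4, 3),
    (21, 2, 20, 4, 1),
]


def find_duplicates(rows):
    """Run the scenario 1 query; return {(source_id, group_id): set of ids}."""
    # Assumption: in-memory SQLite stands in for MySQL; "links" is a made-up name.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links (id INTEGER PRIMARY KEY, group_id INTEGER, "
        "source_id INTEGER, target_id INTEGER, sortsequence INTEGER)"
    )
    conn.executemany("INSERT INTO links VALUES (?, ?, ?, ?, ?)", rows)
    cursor = conn.execute(
        "SELECT source_id, group_id, GROUP_CONCAT(id) AS children "
        "FROM links "
        "GROUP BY source_id, sortsequence, group_id "
        "HAVING COUNT(*) > 1"
    )
    # GROUP_CONCAT returns a comma-separated string; its order is not guaranteed,
    # so the ids are collected into a set.
    result = {(s, g): {int(i) for i in ids.split(",")} for s, g, ids in cursor}
    conn.close()
    return result


duplicates = find_duplicates(ROWS)
print(duplicates)
```

On the sample data only the pair source_id = 2, group_id = 1 comes back, with ids 2 and 5. The rows with id 20 and 21 are not reported, because their sortsequence values differ and therefore never land in the same group.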
I got the answer from Tere J's comments. The following query covers both of the criteria mentioned above.
SELECT
source_id, `group_id`, GROUP_CONCAT(id) AS faultyIDS
FROM
`table`
GROUP BY
source_id, `group_id`
HAVING
COUNT(DISTINCT sortsequence) <> COUNT(sortsequence)
OR COUNT(sortsequence) <> MAX(sortsequence)
OR MIN(sortsequence) <> 1
Maybe it can help others.
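The reasoning behind the HAVING clause, as I understand it: if a group has n rows whose sortsequence values are all distinct, the smallest is 1, and the largest is n, then the values must be exactly 1 to n, so any group failing one of the three checks has a duplicate or a gap. Below is a small sketch that runs the query on the sample rows. The assumptions are mine, not part of the original answer: Python's sqlite3 module with in-memory SQLite stands in for MySQL, the table name links is hypothetical, sortsequence is taken to hold positive integers, and the two extra columns has_duplicate and has_gap are added only to show which rule a group breaks.

```python
import sqlite3

# Sample rows from the question: (id, group_id, source_id, target_id, sortsequence).
ROWS = [
    (2, 1, 2, 4, 1),
    (4, 1, 20, 2, 1),
    (5, 1, 2, 14, 1),
    (7, 1, 2, 7, 3),
    (20, 2, 20, 4, 3),
    (21, 2, 20, 4, 1),
]

# The HAVING clause is the one from the answer. has_duplicate and has_gap are
# illustrative additions; has_gap assumes positive integer sortsequence values.
QUERY = """
SELECT source_id, group_id, GROUP_CONCAT(id) AS faultyIDS,
       COUNT(DISTINCT sortsequence) <> COUNT(sortsequence) AS has_duplicate,
       COUNT(DISTINCT sortsequence) <> MAX(sortsequence) AS has_gap
FROM links
GROUP BY source_id, group_id
HAVING COUNT(DISTINCT sortsequence) <> COUNT(sortsequence)
    OR COUNT(sortsequence) <> MAX(sortsequence)
    OR MIN(sortsequence) <> 1
"""


def find_faulty_groups(rows):
    """Return {(source_id, group_id): (set of ids, has_duplicate, has_gap)}."""
    # Assumption: in-memory SQLite stands in for MySQL; "links" is a made-up name.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links (id INTEGER PRIMARY KEY, group_id INTEGER, "
        "source_id INTEGER, target_id INTEGER, sortsequence INTEGER)"
    )
    conn.executemany("INSERT INTO links VALUES (?, ?, ?, ?, ?)", rows)
    faulty = {}
    for source_id, group_id, ids, has_duplicate, has_gap in conn.execute(QUERY):
        id_set = {int(i) for i in ids.split(",")}  # concatenation order is not guaranteed
        faulty[(source_id, group_id)] = (id_set, bool(has_duplicate), bool(has_gap))
    conn.close()
    return faulty


faulty = find_faulty_groups(ROWS)
for key, value in sorted(faulty.items()):
    print(key, value)
```

Note that the query reports every id in a faulty group, not only the offending rows: for source_id = 2, group_id = 1 it lists 2, 5 and 7, even though only 2 and 5 share a sortsequence. Also, in MySQL the GROUP_CONCAT result is truncated at group_concat_max_len (1024 bytes by default), so very large groups may need that setting raised.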