Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL delete except one

Tags:

sql

mysql

I've the following database structure:

id   idproperty    idgbs
 1    1             136
 2    1             128       
 3    1             10
 4    1             1
 5    2             136
 6    2             128
 7    2             10
 8    2             1
 9    3             561
10    3             560
11    3             10
12    3             1
13    4             561
14    4             560
15    4             10
16    4             1
17    5             234
18    5             120
19    5             1
20    6             234
21    6             120
22    6             1

Here are the details:

The table refers idproperty with different geographic location. For example:

idgbs

  1 refers to United States
 10 refers to Alabama with parentid 1 (United States)
128 refers to Alabama Gulf Coast with parentid 10 (Alabama)
136 Dauphin Island with parentid 128 (Alabama Gulf Coast)

So, the structure is:

United States > Alabama > Alabama Gulf Coast > Dauphin Island

I want to delete all entries for idproperty EXCEPT the first with the set of idgbs 136, 128, 10, 1 i.e. leave atleast 1 property in all GBS and delete others.

Also, sometimes it is 4 level of geographic entries, sometimes it is 3 level.

Please share the logic & SQL query to delete all entries except one in every unique GBS.

GBS 1, 10, 128, 136 is one unique, so database should only contain 1 property id with these GBS.

After the query, the table would look like this:

id   idproperty    idgbs
 1    1             136
 2    1             128       
 3    1             10
 4    1             1
 9    3             561
10    3             560
11    3             10
12    3             1
17    5             234
18    5             120
19    5             1

Rephrasing the question:

I want to keep properties in every root level GBS i.e. there should be only ONE property in Dauphin Island.

like image 676
dang Avatar asked Apr 19 '26 20:04

dang


1 Answers

Whew... I think I understand what you are after now. I couldn't let this one go ;-)

I had to realize that in the question, you wanted property 2 deleted, because it shared a hierarchy with property 1. Once I realized that, I got the following idea. Basically, we join to an aggregated version of self twice: the first one tells us what our "gbs hierarchy path" is, and the second one matches any previous properties with the same hierarchy. Rows which find that there are no "previous" properties that share their hierarchy are spared, the rest with that hierarchy are deleted. It's possible that this could be further tweaked, but I wanted to share this now. I have tested it with the data you showed, and I got the results you posted.

DELETE 
  each_row.*
FROM property_gbs AS each_row

JOIN ( SELECT 
         idproperty, 
         GROUP_CONCAT(idgbs ORDER BY idgbs DESC SEPARATOR "/") AS idgbs_path 
       FROM property_gbs 
       GROUP BY idproperty
     ) AS mypath 
     USING(idproperty)

LEFT JOIN ( SELECT 
              idproperty, 
              GROUP_CONCAT(idgbs ORDER BY idgbs DESC SEPARATOR "/") AS idgbs_path 
            FROM property_gbs  
            GROUP BY idproperty
          ) AS previous_property 
      ON mypath.idgbs_path = previous_property.idgbs_path 
  AND previous_property.idproperty < each_row.idproperty

WHERE previous_property.idproperty

Note that the last line is not a typo, we are just checking if there is a previous property with the same path. If there is, then delete the currently-evaluated row.

Cheers!

note for clarification

The thought here is to associate every row with it's hierarchy, even if it's a row which represents somewhere in the middle of the hierarchy (such as row: {2, 1, 128} in the question). With the first join to the aggregate, each row now "knows" what it's path is (so that row would get "136/128/10/1"). We can then use that value in the second join to find other properties with the same path, but only if they have a LOWER property id. This allows us to check for the existence of a lower-ID property with the same "path", and delete any row which represents a property which does have such a "lower-order path-sibling".

like image 192
Chris Trahey Avatar answered Apr 21 '26 10:04

Chris Trahey