In order to trim a production database for loading in a test system, we've deleted rows in many tables. This now left us with cruft in a couple of tables, namely rows which aren't used in any FK relation anymore. What I want to achieve is like the garbage collection in Java.
Or to put it another way: If I have M tables in the database. N of them (i.e. most but not all) have foreign key relations. I've deleted a couple of high level rows (i.e. which only have outgoing FK relations) via SQL. This leaves the rows in the related tables alone.
Does someone have a SQL stored procedure or a Java program which finds the N tables and then follows all the FK relations to delete rows which are no longer needed.
If finding the N tables to too complex, I could probably provide the script a list of tables to scan or, preferably, a negative list of tables to ignore.
Also note:
A
, B
, C
, ... all use rows in Z
.This issue is addressed in the MySQL Performance blog, http://www.percona.com/blog/2011/11/18/eventual-consistency-in-mysql/
He provides the following meta query, to generate queries that will identify orphaned nodes;
SELECT CONCAT(
'SELECT ', GROUP_CONCAT(DISTINCT CONCAT(K.CONSTRAINT_NAME, '.', P.COLUMN_NAME,
' AS `', P.TABLE_SCHEMA, '.', P.TABLE_NAME, '.', P.COLUMN_NAME, '`') ORDER BY P.ORDINAL_POSITION), ' ',
'FROM ', K.TABLE_SCHEMA, '.', K.TABLE_NAME, ' AS ', K.CONSTRAINT_NAME, ' ',
'LEFT OUTER JOIN ', K.REFERENCED_TABLE_SCHEMA, '.', K.REFERENCED_TABLE_NAME, ' AS ', K.REFERENCED_TABLE_NAME, ' ',
' ON (', GROUP_CONCAT(CONCAT(K.CONSTRAINT_NAME, '.', K.COLUMN_NAME) ORDER BY K.ORDINAL_POSITION),
') = (', GROUP_CONCAT(CONCAT(K.REFERENCED_TABLE_NAME, '.', K.REFERENCED_COLUMN_NAME) ORDER BY K.ORDINAL_POSITION), ') ',
'WHERE ', K.REFERENCED_TABLE_NAME, '.', K.REFERENCED_COLUMN_NAME, ' IS NULL;'
) AS _SQL
FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE K
INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE P
ON (K.TABLE_SCHEMA, K.TABLE_NAME) = (P.TABLE_SCHEMA, P.TABLE_NAME)
AND P.CONSTRAINT_NAME = 'PRIMARY'
WHERE K.REFERENCED_TABLE_NAME IS NOT NULL
GROUP BY K.CONSTRAINT_NAME;
I converted this to find childless parents, producing;
SELECT CONCAT(
'SELECT ', GROUP_CONCAT(CONCAT(K.REFERENCED_TABLE_NAME, '.', K.REFERENCED_COLUMN_NAME) ORDER BY K.ORDINAL_POSITION), ' ',
'FROM ', K.REFERENCED_TABLE_SCHEMA, '.', K.REFERENCED_TABLE_NAME, ' AS ', K.REFERENCED_TABLE_NAME, ' ',
'LEFT OUTER JOIN ', K.TABLE_SCHEMA, '.', K.TABLE_NAME, ' AS ', K.CONSTRAINT_NAME, ' ',
' ON (', GROUP_CONCAT(CONCAT(K.CONSTRAINT_NAME, '.', K.COLUMN_NAME) ORDER BY K.ORDINAL_POSITION),
') = (', GROUP_CONCAT(CONCAT(K.REFERENCED_TABLE_NAME, '.', K.REFERENCED_COLUMN_NAME) ORDER BY K.ORDINAL_POSITION), ') ',
'WHERE ', K.CONSTRAINT_NAME, '.', K.COLUMN_NAME, ' IS NULL;'
) AS _SQL
FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE K
INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE P
ON (K.TABLE_SCHEMA, K.TABLE_NAME) = (P.TABLE_SCHEMA, P.TABLE_NAME)
AND P.CONSTRAINT_NAME = 'PRIMARY'
WHERE K.REFERENCED_TABLE_NAME IS NOT NULL
GROUP BY K.CONSTRAINT_NAME;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With