The scenario is simple.
I have a somewhat large MySQL database containing two tables:
-- Table 1
id (primary key) | some other columns without constraints
-----------------+----------------------------------------
1                | foo
2                | bar
3                | foobar
...              | ...

-- Table 2
id_src | id_trg | some other columns without constraints
-------+--------+----------------------------------------
1      | 2      | ...
1      | 3      | ...
2      | 1      | ...
2      | 3      | ...
2      | 5      | ...
...
id is the primary key. This table contains about 12M entries.

id_src and id_trg together form the primary key; both have a foreign key constraint on table1's id, and both have ON DELETE CASCADE enabled. This table contains about 110M entries.

Ok, now what I'm doing is just creating a list of ids that I want to remove from table 1, and then executing a simple

DELETE FROM table1 WHERE id IN (<the list of ids>);

The latter, as you may have guessed, deletes the corresponding rows from table2 as well. So far so good, but the problem is that when I run this in a multi-threaded environment, I get many deadlocks!
A few notes:
I have already tried SELECT ... FOR UPDATE, but then a simple DELETE takes up to 30 seconds (so there is no point in using it):
DELETE FROM table1
WHERE id IN (
    SELECT id FROM (
        SELECT * FROM table1
        WHERE id='some_id'
        FOR UPDATE) AS x);
How can I fix this?
I would appreciate any help and thanks in advance :)
Edit:
- The multi-threading is handled with java.util.concurrent.
- For the nested SELECTs, please refer to MySQL can't specify target table for update in FROM clause.
- The reason I used the FOR UPDATE clause is that I've heard it locks a single row instead of locking the whole table.
- ON DELETE CASCADE on id_src and id_trg (each separately) means that every delete on table1 with id=x leads to deletes on table2 where id_src=x and id_trg=x; see the sketch after this list.
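For illustration, deleting one parent row with id = x then behaves roughly as if these three statements ran. This is only a sketch (jdbc stands for any JdbcTemplate and x for the parent id; InnoDB actually performs the cascade internally):

public void cascadeIllustration(JdbcTemplate jdbc, long x) {
    // Sketch only: InnoDB does this implicitly and also locks the affected
    // child rows in table2, which is where concurrent deletes start to clash.
    jdbc.update("DELETE FROM table2 WHERE id_src = ?", x); // cascade via the id_src FK
    jdbc.update("DELETE FROM table2 WHERE id_trg = ?", x); // cascade via the id_trg FK
    jdbc.update("DELETE FROM table1 WHERE id = ?", x);     // the statement actually issued
}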
Some code as requested:
public void write(List<Long> data) {
    try {
        ArrayList<Long> idsToDelete = getIdsToDelete();
        // Join the ids as "1, 2, 3" for the IN clause
        String ids = idsToDelete.stream().map(String::valueOf).collect(Collectors.joining(", "));
        String query = "DELETE FROM table1 WHERE id IN (" + ids + ")";
        mysqlJdbcTemplate.getJdbcTemplate().batchUpdate(query);
    } catch (Exception e) {
        LOG.error(e);
    }
}
and mysqlJdbcTemplate is just an abstract class that extends JdbcDaoSupport.
This is actually a way of handling MySQL deadlocks using Spring: use Spring's multiple-step mechanism, so that the DELETE queries are split into three. The first step deletes on the blocking column, and the other steps delete on the two other columns respectively. A sketch of the idea follows.
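A minimal sketch of that idea, assuming a JdbcTemplate and a TransactionTemplate are available (all names here are illustrative, not from the original code):

void deleteSplitInThree(JdbcTemplate jdbc, TransactionTemplate tx, List<Long> ids) {
    if (ids.isEmpty()) {
        return; // avoid an empty, invalid IN () clause
    }
    String idList = ids.stream().map(String::valueOf).collect(Collectors.joining(", "));
    tx.execute(status -> {
        // Always the same statement order, so concurrent workers acquire
        // row locks in a consistent order instead of colliding through the
        // implicit cascaded deletes.
        jdbc.update("DELETE FROM table2 WHERE id_src IN (" + idList + ")");
        jdbc.update("DELETE FROM table2 WHERE id_trg IN (" + idList + ")");
        jdbc.update("DELETE FROM table1 WHERE id IN (" + idList + ")");
        return null;
    });
}

With the child rows already gone by the time table1 is touched, the cascade has nothing left to delete, which is what removes the unpredictable lock order.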
First of all, your simple delete query in which you are passing ids should not create a problem if you pass ids up to a limit like 1,000 (the number of affected rows in the child table should also stay around that order, not tens of thousands), but if you pass 50,000 or more then it can create locking issues.

To avoid deadlock, you can follow the approach below (assuming bulk deletion will not be part of the production system):

Step 1: Fetch all ids with a select query and keep them in a cursor.
Step 2: Delete the ids stored in the cursor one by one in a stored procedure, as sketched below.

Note: To check why deletion is acquiring locks, we have to check several things, like how many ids you are passing, what transaction isolation level is set at the DB level, and what your MySQL configuration settings in my.cnf are...
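The stored-procedure-with-cursor part is server-side; an equivalent client-side sketch of the same fetch-then-delete-one-by-one pattern, using plain JDBC (the id-select query and all names are illustrative assumptions):

import java.sql.*;
import java.util.*;

class OneByOneDeleter {
    // Fetch the ids first (the "cursor"), then delete them one at a time, so
    // that each delete, plus its cascaded table2 deletes, stays a tiny
    // transaction that is unlikely to deadlock.
    static void deleteOneByOne(Connection conn, String idSelectQuery) throws SQLException {
        List<Long> ids = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(idSelectQuery)) {
            while (rs.next()) {
                ids.add(rs.getLong(1));
            }
        }
        try (PreparedStatement ps = conn.prepareStatement("DELETE FROM table1 WHERE id = ?")) {
            for (Long id : ids) {
                ps.setLong(1, id);
                ps.executeUpdate(); // with autocommit on, each row is its own transaction
            }
        }
    }
}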
It may be dangerous to delete many (> 10,000) parent records, each having child records deleted by cascade, because the more records you delete at a single time, the higher the chance of a lock conflict leading to a deadlock or rollback.

If it is acceptable (meaning you can make a direct JDBC connection to the database), you should do the whole deletion from a single connection, with no threading involved. As the heavier job is on the database side, I highly doubt that threading will help here. If you want to try it anyway, I would recommend executing DELETE FROM table1 WHERE id = ? in batches, as sketched below. I cannot imagine that the whole process could take more than several minutes.
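A minimal sketch of that recommendation with the question's JdbcTemplate (the batch size is an illustrative assumption):

void deleteInBatches(JdbcTemplate jdbc, List<Long> ids) {
    final int batchSize = 1000; // illustrative; tune to your workload
    for (int from = 0; from < ids.size(); from += batchSize) {
        List<Long> chunk = ids.subList(from, Math.min(from + batchSize, ids.size()));
        List<Object[]> args = new ArrayList<>();
        for (Long id : chunk) {
            args.add(new Object[] { id });
        }
        // One parameterized statement, sent as a single JDBC batch
        jdbc.batchUpdate("DELETE FROM table1 WHERE id = ?", args);
    }
}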
After some further reading, it looks like I was used to old systems and that my numbers are really conservative.