Avoiding MySQL Deadlocks in a multithreaded Spring app

The scenario is simple.
I have a somewhat large MySQL database containing two tables:

-- Table 1
id (primary key) | some other columns without constraints
-----------------+--------------------------------------
       1         |       foo
       2         |       bar
       3         |       foobar
      ...        |       ...

-- Table 2
id_src | id_trg | some other columns without constraints
-------+--------+---------------------------------------
   1   |   2    |    ...
   1   |   3    |    ...
   2   |   1    |    ...
   2   |   3    |    ...
   2   |   5    |    ...
   ...
  • On table1 only id is a primary key. This table contains about 12M entries.
  • On table2, id_src and id_trg together form the primary key; each has a foreign key constraint referencing table1's id, with ON DELETE CASCADE enabled. This table contains about 110M entries.

Ok, now all I'm doing is building a list of ids that I want to remove from table1 and then executing a simple DELETE FROM table1 WHERE id IN (<the list of ids>);

As you may have guessed, this delete also removes the corresponding rows from table2. So far so good, but the problem is that when I run this in a multi-threaded environment, I get many deadlocks!

A few notes:

  • There is no other process running at the same time, nor will there be (for the time being)
  • I want this to be fast! I have about 24 threads (if that makes any difference to the answer)
  • I have already tried almost all of the transaction isolation levels (except TRANSACTION_NONE); see Java sql connection transaction isolation
  • Ordering/sorting the ids would not, I think, help!
  • I have already tried SELECT ... FOR UPDATE, but then a simple DELETE takes up to 30 seconds (so there is no point in using it):

    DELETE FROM table1 
    WHERE id IN ( 
        SELECT id FROM (
            SELECT * FROM table1 
            WHERE id='some_id' 
            FOR UPDATE) AS x);  
    

How can I fix this?

I would appreciate any help and thanks in advance :)

Edit:

  1. Using the InnoDB engine
  2. On a single thread this process would take a dozen hours, maybe even a whole day, but I'm aiming for a few hours!
  3. I'm already using a connection pool manager: java.util.concurrent
  4. For an explanation of the double-nested SELECTs, please refer to MySQL can’t specify target table for update in FROM clause
  5. The list of ids to be deleted from the DB may contain a couple of million entries in total, divided into chunks of 200
  6. I use the FOR UPDATE clause because I've heard that it locks a single row instead of locking the whole table
  7. The app uses Spring's batchUpdate(String sqlQuery) method, so the transactions are managed automatically
  8. All ids are indexed, and the ids are unique strings of at most 50 chars!
  9. ON DELETE CASCADE on id_src and id_trg (each separately) means that every delete of table1 id=x leads to deletes of the table2 rows with id_src=x and id_trg=x
  10. Some code as requested:

    public void write(List data){
        try {
            List<String> idsToDelete = getIdsToDelete();
            // quote each id and join them into a comma-separated list for the IN clause
            String inList = idsToDelete.stream()
                    .map(id -> "'" + id + "'")
                    .collect(Collectors.joining(", "));
            String query = "DELETE FROM table1 WHERE id IN (" + inList + ")";
            mysqlJdbcTemplate.getJdbcTemplate().batchUpdate(query);
        } catch (Exception e) {
            LOG.error(e);
        }
    }
    

and mysqlJdbcTemplate is just an instance of a class that extends JdbcDaoSupport.
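One detail worth noting about the snippet above: concatenating the id list directly into the SQL string is fragile (and invites SQL injection). A minimal sketch of building the statement with bound placeholders instead, assuming the table and column names from the question (the class and method names here are made up):

```java
import java.util.StringJoiner;

public class DeleteQueryBuilder {
    // Build "DELETE FROM table1 WHERE id IN (?, ?, ..., ?)" for one chunk of ids,
    // so the values can be bound as parameters instead of concatenated into the SQL.
    public static String buildDelete(int chunkSize) {
        StringJoiner placeholders = new StringJoiner(", ", "(", ")");
        for (int i = 0; i < chunkSize; i++) {
            placeholders.add("?");
        }
        return "DELETE FROM table1 WHERE id IN " + placeholders;
    }
}
```

The resulting string could then be passed to JdbcTemplate.update(sql, args) with the chunk of 200 ids as the argument array.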

Asked Jun 18 '15 by Hamed




2 Answers

First of all, your simple delete query passing ids should not create a problem as long as you pass a limited number of ids, say up to 1,000 (so that the number of affected rows in the child table also stays moderate, not in the tens of thousands); but if you pass 50,000 or more, it can create locking issues.

To avoid deadlock, you can follow the approach below (assuming bulk deletion will not be part of the production system):

Step 1: Fetch all ids with a select query and keep them in a cursor.

Step 2: Now delete the ids stored in the cursor one by one in a stored procedure.
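A rough Java sketch of step 2, with the database calls abstracted away so only the loop-and-commit pattern is shown. In a real app, deleteOne would execute DELETE FROM table1 WHERE id = ? and commit would call Connection.commit(); both are placeholder callbacks, not part of the stored procedure itself:

```java
import java.util.List;
import java.util.function.Consumer;

public class CursorDeleter {
    // Delete the fetched ids one at a time; each delete locks only that
    // parent row (plus its cascaded child rows), and committing every few
    // deletes keeps each transaction short.
    public static void deleteOneByOne(List<String> ids, Consumer<String> deleteOne,
                                      int commitEvery, Runnable commit) {
        int sinceCommit = 0;
        for (String id : ids) {
            deleteOne.accept(id);
            if (++sinceCommit == commitEvery) {
                commit.run();
                sinceCommit = 0;
            }
        }
        if (sinceCommit > 0) {
            commit.run();
        }
    }
}
```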

Note: To check why the deletion is acquiring locks, we have to check several things: how many ids you are passing, the transaction isolation level set at the DB level, your MySQL configuration settings in my.cnf, etc.

Answered Oct 20 '22 by Zafar Malik


It may be dangerous to delete many (> 10,000) parent records, each having child records deleted by the cascade, because the more records you delete at a time, the higher the chance of a lock conflict leading to a deadlock or rollback.

If it is acceptable (meaning you can make a direct JDBC connection to the database), you should (no threading involved here):

  • compute the list of ids to delete
  • delete them in batches (of between 10 and 100, a priori), committing every 100 or 1000 records
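Sketched with plain JDBC below, using the table name from the question; the batch size and commit interval are the guesses above, not measured values:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class BatchDeleter {
    // Split the full id list into small chunks.
    public static List<List<String>> partition(List<String> ids, int size) {
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += size) {
            chunks.add(ids.subList(i, Math.min(i + size, ids.size())));
        }
        return chunks;
    }

    public static void delete(Connection conn, List<String> ids) throws SQLException {
        conn.setAutoCommit(false);
        int batchesSinceCommit = 0;
        try (PreparedStatement ps = conn.prepareStatement("DELETE FROM table1 WHERE id = ?")) {
            for (List<String> chunk : partition(ids, 50)) {   // 10-100 deletes per batch
                for (String id : chunk) {
                    ps.setString(1, id);
                    ps.addBatch();
                }
                ps.executeBatch();
                if (++batchesSinceCommit == 20) {             // commit every ~1000 records
                    conn.commit();
                    batchesSinceCommit = 0;
                }
            }
        }
        conn.commit();
    }
}
```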

As the heavier job should be on the database side, I very much doubt that threading will help here. If you want to try it anyway, I would recommend:

  • one single thread (with a dedicated database connection) computing the list of ids to delete and feeding them into a synchronized queue
  • a small number of threads (4, maybe 8), each with its own database connection, that:
    • use a prepared DELETE FROM table1 WHERE id = ? in batches
    • take ids from the queue and prepare the batches
    • send a batch to the database every 10 or 100 records
    • commit every 10 or 100 batches
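The queue-plus-workers design above can be sketched like this, with the JDBC part factored out into a flush callback so that the threading structure stands alone. In the real app, flush would add each id in the batch to a prepared DELETE FROM table1 WHERE id = ? statement, call executeBatch(), and commit every few batches; the callback must be thread-safe since several workers invoke it concurrently:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class ParallelDeleter {
    static final String POISON = ""; // sentinel; real ids are assumed non-empty

    // Worker loop: drain ids from the queue and hand them off in batches.
    static void consume(BlockingQueue<String> queue, int batchSize,
                        Consumer<List<String>> flush) throws InterruptedException {
        List<String> batch = new ArrayList<>(batchSize);
        for (String id = queue.take(); !id.equals(POISON); id = queue.take()) {
            batch.add(id);
            if (batch.size() == batchSize) {
                flush.accept(new ArrayList<>(batch));
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            flush.accept(batch); // final partial batch
        }
    }

    public static void run(List<String> ids, int workers, int batchSize,
                           Consumer<List<String>> flush) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    consume(queue, batchSize, flush);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Single producer feeds the queue, then one sentinel per worker.
        for (String id : ids) queue.put(id);
        for (int i = 0; i < workers; i++) queue.put(POISON);
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}
```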

I cannot imagine the whole process taking more than several minutes.

After some further reading, it looks like I was used to old systems and my numbers are really conservative.

Answered Oct 20 '22 by Serge Ballesta