I checked other similar issues on Stack Overflow, e.g. "deadlock in MySQL", but none of them led to a solution.
REPLACE INTO db2.table2 (id, some_identifier_id, name, created_at, updated_at) (SELECT id, some_identifier_id, name, created_at, updated_at FROM db1.table1 WHERE some_identifier_id IS NOT NULL AND some_identifier_id NOT IN (SELECT some_identifier_id FROM db2.table1 WHERE some_other_identifier_id IS NOT NULL));
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Situation:
Tried:
The most probable cause is the multi-master replication behind the Galera cluster and its optimistic locking (http://www.severalnines.com/blog/avoiding-deadlocks-galera-set-haproxy-single-node-writes-and-multi-node-reads). But should that make the query fail when it is executed on an individual node? Once it succeeds there, I will have to run the same query against the multi-master setup, but I assume that if the basic issue is solved, the replicated servers will not cause problems any more.
Note:
I need to do this without a temp table and without storing the subquery's result in application code. Because of some other dependencies, executing a single query is the most favorable approach so far.
Okay, I found a workaround for this. Based on my research and tests, I think there are two issues behind this failure.
Do not rely on auto-increment values to be sequential. Galera uses a mechanism based on autoincrement increment to produce unique non-conflicting sequences, so on every single node the sequence will have gaps. https://mariadb.com/kb/en/mariadb/mariadb-galera-cluster-known-limitations/
Galera Cluster uses at the cluster-level optimistic concurrency control, which can result in transactions that issue a COMMIT aborting at that stage. http://galeracluster.com/documentation-webpages/limitations.html
In short: the query ran successfully on an individual server, but it failed once Galera was involved. Removing the auto-increment primary key from the query and restarting the transaction whenever a deadlock occurs solved the problem.
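The "restart on deadlock" part can be sketched roughly as below. This is only an illustration, not the exact code I used: `run_with_deadlock_retry`, `is_deadlock`, `max_attempts` and `base_delay` are hypothetical names, and it assumes the MySQL driver raises an exception whose first argument is the numeric error code (1213 for the deadlock error shown above), which should be checked against the driver in use.

```python
import time

# MySQL error number from "ERROR 1213 (40001): Deadlock found ..."
DEADLOCK_ERRNO = 1213


def is_deadlock(exc):
    """Return True if exc looks like MySQL error 1213.

    Assumption: the driver stores the numeric error code in exc.args[0].
    """
    return bool(exc.args) and exc.args[0] == DEADLOCK_ERRNO


def run_with_deadlock_retry(run_transaction, max_attempts=5,
                            base_delay=0.1, sleep=time.sleep):
    """Call run_transaction(), restarting it when a deadlock is reported.

    run_transaction is a hypothetical callable that opens a transaction,
    executes the REPLACE INTO ... SELECT statement and commits.
    Any other error, or a deadlock on the last attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction()
        except Exception as exc:
            if not is_deadlock(exc) or attempt == max_attempts:
                raise
            # Back off a little longer after each failed attempt.
            sleep(base_delay * attempt)
```

The whole transaction is retried, not only the failing statement, because with Galera's optimistic concurrency control the abort can also happen at COMMIT.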
[Edit]
I've written a post explaining the schema, the environment, the issue and how I worked around it. It may be useful to someone facing the same issue.
The issue has been reported to the community and is still open.