Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL rationale behind "Can't specify target table for update in FROM clause"

In MySQL, I cannot reuse a table in a predicate, if I am executing a DELETE or UPDATE on the same table. For instance, this is not possible:

DELETE FROM story_category WHERE category_id NOT IN (
--          ^^^^^^^^^^^^^^ deleting from this table...

SELECT DISTINCT category.id FROM category 
       INNER JOIN story_category ON category_id=category.id);
--                ^^^^^^^^^^^^^^ ... prevents using the same table in predicates

The above example was taken from this Stack Overflow question, where the accepted answer and many other answers point out that there is a trick you can use by nesting the subquery into another subquery to trick MySQL into not recognising the table reuse:

DELETE FROM story_category
--          ^^^^^^^^^^^^^^
WHERE category_id NOT IN (
    SELECT cid FROM (
        SELECT DISTINCT category.id AS cid FROM category 
        INNER JOIN story_category ON category_id=category.id
--                 ^^^^^^^^^^^^^^ apparently no longer considered the same table
    ) AS c
)

Now, this workaround and the fact that it is even possible raise two big questions:

  • The error checking is obviously not well implemented. A simple query transformation, and MySQL no longer recognises a situation that it tries to forbid for (hopefully) a good reason.
  • Because there probably is a good reason, it probably isn't a good idea to work around this limitation.

I suspect that the reason for preventing such access in predicates is related to some guarantees that MySQL tries to establish with respect to ACID-ness. I'm afraid that if working around this limitation, I might corrupt my data or generate funny race conditions in edge-cases.

Unfortunately, the MySQL manual is not very explicit with respect to the rationale of this error:

Currently, you cannot update a table and select from the same table in a subquery.

So, my question is:

What are the reasons why MySQL disallows this very common idiom where the table from an UPDATE or DELETE statement must not be referenced in any predicates?

like image 237
Lukas Eder Avatar asked Nov 01 '22 23:11

Lukas Eder


1 Answers

The workaround is suggested in the documentation, where it is explained that it works because the data is materialized into a temporary table, as an optimizer "trick" for better performance. Since it's documented, I don't think there are any caveats in using it.

The "minimal" example of the workaround in the documentation is:

UPDATE t ... WHERE col = (SELECT * FROM (SELECT ... FROM t...) AS dt ...);

I too cannot find an explanation to why a simple UPDATE from subquery wouldn't work in MySQL, but this materialization workaround makes me think it has something to do with sorting out locking conflicts: you can't get an exclusive lock for a row UPDATE because it already has a shared lock for SELECT. When the SELECT comes from a second, "materialized" temporary table, there is no conflict.

That said, I admit this shouldn't make much sense since InnoDB documentation explains in detail that it supports tuple multiversioning and all four lock isolation levels, so this kind of operation should be supported as it is in PostgreSQL, Oracle and others.

Maybe the error message it gives out is a remaining failsafe from the MyISAM implementation since it only supports full table locks, but I wouldn't know why it should be in place for other engines.

like image 165
nunks Avatar answered Nov 08 '22 06:11

nunks