I'm having the following problem using Django with MySQL 5.5.22.
Given a table with columns id, level and a 2x2 matrix stored as a11, a12, a21, a22, I have this row:
id a11 a12 a21 a22 level
324 3 2 5 3 2
Given a queryset qs, I do the following update:
qs.update(
a11=(b12 * a21 - b11 * a22) * F('a11') + (b11 * a12 - b12 * a11) * F('a21'),
a12=(b12 * a21 - b11 * a22) * F('a12') + (b11 * a12 - b12 * a11) * F('a22'),
a21=(b22 * a21 - b21 * a22) * F('a11') + (b21 * a12 - b22 * a11) * F('a21'),
a22=(b22 * a21 - b21 * a22) * F('a12') + (b21 * a12 - b22 * a11) * F('a22'),
level=(F('level') - 1)
)
For which django generates the following query (got it from db.connection.queries, remove the where clause for brevity):
UPDATE `storage`
SET
`a21` = (3 * `storage`.`a11`) + (-1 * `storage`.`a21`),
`a22` = (3 * `storage`.`a12`) + (-1 * `storage`.`a22`),
`level` = `storage`.`level` - -1,
`a11` = (2 * `storage`.`a11`) + (-1 * `storage`.`a21`),
`a12` = (2 * `storage`.`a12`) + (-1 * `storage`.`a22`)
And my row looks like this after that:
id a11 a12 a21 a22 level
324 2 1 4 3 1
For any row, a12*a21 - a11*a22 = 1
is supposed to be True, and according to that, the row was supposed to be:
id a11 a12 a21 a22 level
324 1 1 4 3 1
This is what I get on SQLite, with Django generating the same query, and it took me a lot of time to figure that MySQL was doing something different. From the query, it seems like when updating interdepent multiple rows, MySQL doesn't treat it as a single atomic operation, and as columns are updated, they affect the values dependent on them. I confirmed this seems to be what happens by the following code on the Python prompt:
>>> a11, a12, a21, a22 = (3, 2, 5, 3)
>>> (2 * a11) + (-1 * a21),\
... (2 * a12) + (-1 * a22),\
... (3 * a11) + (-1 * a21),\
... (3 * a12) + (-1 * a22)
(1, 1, 4, 3)
If columns are updated one at a time, in the same order given by the query:
>>> a11, a12, a21, a22 = (3, 2, 5, 3)
>>> a21 = (3*a11) + (-1*a21)
>>> a22 = (3*a12) + (-1*a22)
>>> a11 = (2*a11) + (-1*a21)
>>> a12 = (2*a12) + (-1*a22)
>>> (a11, a12, a21, a22)
(2, 1, 4, 3)
This is really scary behavior, since this is a library meant to be used cross-platform. My questions are:
edit
The problem is clear, but I'm still looking for a solution. Pulling all values and pushing them back is not an acceptable solution for this particular application.
PostgreSQL, Oracle, and SQL Server all treat this as an atomic operation. See the following SQL Fiddle, and switch the server to see the behavior of the following SQL:
CREATE TABLE Swap (
a CHAR(1),
b CHAR(1)
);
INSERT INTO Swap (a, b) VALUES ('a', 'b');
UPDATE Swap SET a = b, b = a;
SELECT * FROM Swap;
MySQL was the only RBDMS that implements this with both columns containing the same value after the update.
As far as how you would resolve this, I would instead pull the values from the database, do the calculations inside of your application (instead of your update statement), and then update the database with the calculated values. This way you can guarantee that the calculation will be performed in a consistent manner.
As stated in the MySQL manual:
The second assignment in the following statement sets
col2
to the current (updated)col1
value, not the originalcol1
value. The result is thatcol1
andcol2
have the same value. This behavior differs from standard SQL.UPDATE t1 SET col1 = col1 + 1, col2 = col1;
Therefore, in your case, the value being used for a21
when evaluating the expression `a11` = (2 * `storage`.`a11`) + (-1 * `storage`.`a21`)
is the new, updated, value of 4 rather than the original value of 5. As the manual says, this behaviour differs from standard SQL.
You could instead use a self-join with the multiple-table UPDATE
syntax, however I don't know whether something like this can be implemented using the Django ORM:
UPDATE storage AS old
JOIN storage AS new USING (id)
SET
new.a21 = (3 * old.a11) + (-1 * old.a21),
new.a22 = (3 * old.a12) + (-1 * old.a22),
new.level = old.level - -1,
new.a11 = (2 * old.a11) + (-1 * old.a21),
new.a12 = (2 * old.a12) + (-1 * old.a22);
See it on sqlfiddle.
My only other thought (which should definitely be implementable within Django) is to split the update into separate parts, defining the fields updated in later parts in relation to the new (rather than the old) values of those fields which have been updated in earlier parts:
UPDATE storage
SET a21 = (3 * a11) + (-1 * a21),
a22 = (3 * a12) + (-1 * a22),
level = level - -1;
UPDATE storage
SET a11 = (2 * a11) + (-1 * (3*a11 - a21)),
a12 = (2 * a12) + (-1 * (3*a12 - a22));
To prevent concurrency issues, you ought to perform these two updates within a transaction (if supported by the RDBMS).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With