Assume I have 10,000 rows that need to be updated. Which would be faster:
UPDATE DB.Servers SET Live = 1 where name = 'server1';
UPDATE DB.Servers SET Live = 1 where name = 'server2';
...
UPDATE DB.Servers SET Live = 1 where name = 'server10000';
OR
UPDATE DB.Servers SET Live = 1 where name in ('server1', 'server2'...'server10000');
I would assume the second option is faster, but I'm not sure. What worries me is that I don't know whether there is a length limit for an SQL statement. What would be recommended in this type of situation?
Thank you
Best practices to improve SQL update statement performance:
Consider the lock escalation mode of the modified table so that the update does not consume too many resources. Analyzing the execution plan may help resolve performance bottlenecks in the update query. Redundant indexes on the table can be removed.
DECLARE @Rows INT,
        @BatchSize INT; -- keep below 5000 to be safe
SET @BatchSize = 2000;
SET @Rows = @BatchSize; -- initialize just to enter the loop
BEGIN TRY
    WHILE (@Rows = @BatchSize)
    BEGIN
        UPDATE TOP (@BatchSize) tab
        SET tab.Value = 'abc1'
        FROM TableName tab
        WHERE tab.Parameter1 = 'abc'
        AND tab. ...
Update performance, just like insert and delete performance, also depends on the number of indexes on the table. The only difference is that update statements do not necessarily affect all columns, because they often modify only a few selected columns.
In general, running multiple update queries inside a single transaction is faster than committing each one separately, but a single set-based statement is often faster still, and many factors affect performance. These include the size and complexity of the data being updated, the database engine in use, and the server configuration.
In SQL Server we use the UPDATE statement to modify data. Updates can be performed in various ways: row by row, in one big batch, or in several smaller batches. In this tip we will look at the differences between these methods. First, we'll set up a SQL Server database table on which to run UPDATE operations.
A bulk update (one big batch) is faster than a row-by-row operation, but it is best used when updating a limited number of rows. It is an expensive operation in terms of query cost, because the single update operation takes more resources.
Updating very large tables can be time-consuming and may sometimes take hours to finish. Here are a few tips for optimizing updates on large data volumes in SQL Server: remove the index on the column to be updated; execute the update in smaller batches; disable delete triggers; replace the UPDATE statement with a bulk-insert operation.
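The "smaller batches" tip can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module rather than SQL Server, so UPDATE TOP is replaced by a rowid subquery with LIMIT. The table `items`, its columns, and the helper names `update_in_batches` and `make_demo_db` are hypothetical.

```python
import sqlite3


def update_in_batches(conn, batch_size=2000):
    """Set value='abc1' on matching rows, at most batch_size rows per statement.

    Returns the total number of rows updated.
    """
    total = 0
    while True:
        cur = conn.execute(
            """
            UPDATE items
               SET value = 'abc1'
             WHERE rowid IN (
                   SELECT rowid FROM items
                    WHERE parameter1 = 'abc' AND value <> 'abc1'
                    LIMIT ?)
            """,
            (batch_size,),
        )
        conn.commit()  # commit per batch so each transaction stays short
        total += cur.rowcount
        if cur.rowcount < batch_size:  # a short batch means nothing is left
            return total


def make_demo_db(n_rows=5000):
    """Build a throwaway in-memory table (hypothetical schema) for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (parameter1 TEXT, value TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [("abc" if i % 2 == 0 else "xyz", "old") for i in range(n_rows)],
    )
    conn.commit()
    return conn
```

The `value <> 'abc1'` filter is what lets the loop make progress: rows that have already been updated no longer match, so each pass picks up a fresh batch.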
The single UPDATE is faster.
I have tested with MySQL 5.1.73:
CREATE TABLE test_random (
val char(40) NOT NULL default '',
num int NOT NULL default '0',
KEY val (val)
) ENGINE=MyISAM;
INSERT INTO test_random (val, num) VALUES
('MXZJBXUGNFOZMMQMYZEMLKZZKTCIGEU',889),
('ZTEBMDHOJGYBYEOPZIIPPJQQOKXMTKU',351),
... [200K records total inserted] ...
('ADLDYZQHDEZMYBHORKGJYMIOVUETQCM',786);
Then here is random-update-single.sql:
UPDATE test_random SET num=1 WHERE val IN (
'PXTUKCZMRFZDTWUPULAPENPNQCSPFQJ',
'GDIMLSCDRSNCMUNUZLQIDFZSEELNZLR',
... [100K records] ...
'ADLDYZQHDEZMYBHORKGJYMIOVUETQCM');
And here is random-update-multiple.sql:
UPDATE test_random SET num=2 WHERE val='PXTUKCZMRFZDTWUPULAPENPNQCSPFQJ';
UPDATE test_random SET num=2 WHERE val='GDIMLSCDRSNCMUNUZLQIDFZSEELNZLR';
... [100K records] ...
UPDATE test_random SET num=2 WHERE val='ADLDYZQHDEZMYBHORKGJYMIOVUETQCM';
Here is the result:
> time mysql -uroot test < random-update-single.sql
0.075u 0.009s 0:01.78 3.9% 0+0k 0+0io 0pf+0w
> time mysql -uroot test < random-update-single.sql
0.074u 0.009s 0:01.76 3.9% 0+0k 0+0io 0pf+0w
> time mysql -uroot test < random-update-single.sql
0.069u 0.013s 0:01.57 4.4% 0+0k 0+0io 0pf+0w
> time mysql -uroot test < random-update-multiple.sql
1.746u 1.515s 0:11.14 29.1% 0+0k 0+0io 0pf+0w
> time mysql -uroot test < random-update-multiple.sql
2.183u 2.150s 0:14.83 29.1% 0+0k 0+0io 0pf+0w
> time mysql -uroot test < random-update-multiple.sql
1.961u 1.949s 0:13.96 27.9% 0+0k 0+0io 0pf+0w
That is, the multiple UPDATE statements turned out to be roughly 6-9 times slower than the single UPDATE (about 11-15 seconds versus 1.6-1.8 seconds of wall-clock time).
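For anyone who wants to repeat this kind of comparison without a MySQL server, here is a rough sketch using Python's built-in sqlite3 module. The `servers` schema and the function names are assumptions made for illustration. An in-memory SQLite database has no client/server round trips, so the gap it shows will not match the MySQL figures above.

```python
import sqlite3
import time


def build_table(n_rows):
    """Create an in-memory servers table with names server1..server<n_rows>."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE servers (name TEXT PRIMARY KEY, live INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO servers (name) VALUES (?)",
        [(f"server{i}",) for i in range(1, n_rows + 1)],
    )
    conn.commit()
    return conn


def update_single(conn, names):
    """One UPDATE with an IN list; returns the number of rows updated."""
    placeholders = ",".join("?" * len(names))
    cur = conn.execute(
        f"UPDATE servers SET live = 1 WHERE name IN ({placeholders})", names
    )
    conn.commit()
    return cur.rowcount


def update_multiple(conn, names):
    """One UPDATE per name, committed once at the end; returns rows updated."""
    updated = 0
    for name in names:
        cur = conn.execute("UPDATE servers SET live = 1 WHERE name = ?", (name,))
        updated += cur.rowcount
    conn.commit()
    return updated


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # 500 names keeps the IN list under older SQLite bound-parameter limits.
    names = [f"server{i}" for i in range(1, 501)]
    for fn in (update_single, update_multiple):
        rows, seconds = timed(fn, build_table(1000), names)
        print(f"{fn.__name__}: {rows} rows in {seconds:.4f}s")
```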
SQL is supposed to be a declarative language; it does not expect the user to say "how" to get the result, only "what" the desired result is. So in principle I would use the in() construct, as this is the most concise (from a logical viewpoint) way to ask for the results, and let the DBMS (any DBMS!) decide what's best.
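On the length-limit worry in the question: engines do cap statement size in one way or another (MySQL through the max_allowed_packet server variable, SQLite through a maximum number of bound parameters), so a very long in() list can be split into chunks, with one UPDATE issued per chunk. Below is a minimal sketch using Python's built-in sqlite3 module. The `servers` table, the helper names, and the chunk size of 500 are assumptions chosen for illustration, not tuned values.

```python
import sqlite3


def chunked(seq, size):
    """Yield consecutive slices of seq with at most size items each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def update_in_chunks(conn, names, chunk_size=500):
    """Mark the named servers live, one IN-list UPDATE per chunk.

    All chunks run in one transaction; returns the number of rows updated.
    """
    updated = 0
    for chunk in chunked(list(names), chunk_size):
        placeholders = ",".join("?" * len(chunk))
        cur = conn.execute(
            f"UPDATE servers SET live = 1 WHERE name IN ({placeholders})", chunk
        )
        updated += cur.rowcount
    conn.commit()
    return updated
```

With 10,000 names and a chunk size of 500 this issues 20 statements instead of 10,000, which keeps most of the benefit of the single UPDATE while bounding the size of each statement.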