How to chunk updates to SQL Server?

Tags:

sql-server

I want to update a table in SQL Server by setting a FLAG column to 1 for all values since the beginning of the year:

TABLE
DATE        ID     FLAG   (more columns...)
2016/01/01  1      0      ...
2016/01/01  2      0      ...
2016/01/02  3      0      ...
2016/01/02  4      0      ...
(etc)

Problem is that this table contains hundreds of millions of records and I've been advised to chunk the updates 100,000 rows at a time to avoid blocking other processes.

I need to remember which rows I update because there are background processes which immediately flip the FLAG back to 0 once they're done processing it.

Does anyone have suggestions on how I can do this? Each day's worth of data has over a million records, so I can't simply loop using the DATE as a counter. I am thinking of using the ID

680

asked Jun 09 '16 18:06

2 Answers

Assuming the date column and the ID column are sequential you could do a simple loop. By this I mean that if there is a record id=1 and date=2016-1-1 then record id=2 date=2015-12-31 could not exist. If you are worried about locks/exceptions you should add a transaction in the WHILE block and commit or rollback on failure.

Change the @batchSize to whatever you feel is right after some experimentation.

DECLARE @currentId int, @maxId int, @batchSize int = 10000

SELECT @currentId = MIN(ID), @maxId = MAX(ID) FROM YOURTABLE WHERE DATE >= '2016-01-01'

WHILE @currentId < @maxId
BEGIN
    UPDATE YOURTABLE SET FLAG = 1 WHERE ID BETWEEN @currentId AND (@currentId + @batchSize)
    SET @currentId = @currentId + @batchSize
END

As this as the update will never flag the same record to 1 twice I do not see a need to track which records were touched unless you are going to manually stop the process partway through.

You should also ensure that the ID column has an index on it so the retrieval is fast in each update statement.

134

answered Nov 15 '22 03:11

Igor

Looks like very simple question or maybe I'm missing something in it.

You can create a temp/permanent table to keep track of updated rows.

create tbl (Id int) -- or temp table based on your case
insert into tbl values (0)

declare @lastId int = (select Id from tbl)

;with cte as (
    select top 100000 
    from YourMainTable
    where Id > @lastId
    ORDER BY Id
)
update cte 
set Flag = 1

update tbl set Id = @lastId + 100000

You can do this process in a loop (except the table creation part)

answered Nov 15 '22 04:11

FLICKER

Related questions
                            
                                How to fast reduce a datatable according to existing entries in a DB
                            
                                Select rows per N and do a SUM
                            
                                Microsoft Server studio 'script table as' for INSERT to bad syntax
                            
                                How to make group by query faster?
                            
                                Windows Authentication With SQLAlchemy and pymssql
                            
                                How to send and receive parameters to/from SQL Server stored procedure
                            
                                Writing SQL Query to query and update same table
                            
                                Entity Framework and Different Environments (Dev/Production)
                            
                                Passing Nullable Integer (int? SomeValue = NULL) As Parameter To Stored Procedure
                            
                                How to start a transaction in ODBC?
                            
                                Login failed when querying linked server
                            
                                Does "Join" order will lead to different query performance [duplicate]
                            
                                Default sort order for a select query in SQL Server 2005 and SQL Server 2012
                            
                                SQL Argument data type int is invalid for argument 1 of charindex function
                            
                                How to load grouped data with SSIS
                            
                                Invalid Column Name using Dynamic SQL
                            
                                How to check for available stock and allocate them to an order using a view in SQL Server
                            
                                Get records via foreign key in another table in Entity Framework 6
                            
                                Converting Hex to Float SQL
                            
                                MSSQL Linked Server error: The OLE DB provider "OraOLEDB.Oracle" for linked server supplied inconsistent metadata for a column

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With