SQL Server, Converting NTEXT to NVARCHAR(MAX)

I have a database with a large number of fields that are currently NTEXT.

Having upgraded to SQL Server 2005, we have run some performance tests on converting these to NVARCHAR(MAX).

If you read this article:

http://geekswithblogs.net/johnsPerfBlog/archive/2008/04/16/ntext-vs-nvarcharmax-in-sql-2005.aspx

The article explains that a simple ALTER COLUMN does not re-organise the data into the rows themselves.

I have seen this with my data. Performance in some areas is actually much worse if we just run the ALTER COLUMN. However, if we then run an UPDATE Table SET Column = Column for each of these fields, we get a huge performance increase.
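For illustration, the two-step conversion looks roughly like this (dbo.Documents and Body are placeholder names, not our real schema):

    -- Step 1: change the type in the metadata; existing values stay in the old LOB pages.
    ALTER TABLE dbo.Documents ALTER COLUMN Body NVARCHAR(MAX) NULL;  -- NULL keeps the column nullable

    -- Step 2: rewrite every value so it is stored using the new layout (in-row where it fits).
    UPDATE dbo.Documents SET Body = Body;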

The problem I have is that the database consists of hundreds of these columns with millions of records. In a simple test (on a low-performance virtual machine), a table with a single NTEXT column containing 7 million records took 5 hours to update.

Can anybody offer any suggestions as to how I can update the data in a more efficient way that minimises downtime and locks?

EDIT: My fallback solution is to just update the data in blocks over time. However, with our data this results in worse performance until all of the records have been updated, so the shorter that window is the better, and I'm still looking for a quicker way to do it.

Robin Day asked Jan 28 '09

2 Answers

If you can get scheduled downtime:

  1. Back up the database
  2. Change recovery model to simple
  3. Remove all indexes from the table you are updating
  4. Add a maintenanceflag column (INT, DEFAULT 0) with a nonclustered index on it
  5. Run: UPDATE TOP (1000) tablename SET column = column, maintenanceflag = 1 WHERE maintenanceflag = 0, so the NTEXT data is rewritten into the new NVARCHAR(MAX) storage

Repeat step 5 as many times as required, inside a loop with a short delay between batches.
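As a rough sketch of that loop (dbo.Documents and Body are placeholder names, and Body is assumed to have already been altered from NTEXT to NVARCHAR(MAX)):

    -- Step 4: flag column plus a nonclustered index so the WHERE clause stays cheap.
    ALTER TABLE dbo.Documents ADD maintenanceflag INT NOT NULL DEFAULT 0;

    CREATE NONCLUSTERED INDEX IX_Documents_maintenanceflag
        ON dbo.Documents (maintenanceflag);

    -- Step 5, repeated: rewrite 1000 rows at a time until nothing is left to do.
    WHILE EXISTS (SELECT 1 FROM dbo.Documents WHERE maintenanceflag = 0)
    BEGIN
        UPDATE TOP (1000) dbo.Documents
        SET Body = Body,                -- rewriting the value moves it into the new storage
            maintenanceflag = 1
        WHERE maintenanceflag = 0;

        WAITFOR DELAY '00:00:02';       -- short pause so other activity can get through
    END

Flagging the rows that are done keeps each batch cheap to find and lets the loop be stopped and resumed at any point.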

Once complete, take another backup, change the recovery model back to its original setting, and recreate the indexes you removed.

Remember that every index or trigger on that table causes extra disk I/O, and that the simple recovery model minimises log file I/O.
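The recovery model switch itself is just the following (the database name is a placeholder, and this assumes the database normally runs in full recovery):

    ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;
    -- ... run the batched conversion here ...
    ALTER DATABASE MyDatabase SET RECOVERY FULL;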

answered by John


How about running the update in batches, updating 1000 rows at a time?

You would use a WHILE loop that increments a counter corresponding to the IDs of the rows to be updated in each iteration of the update query. This may not reduce the total time it takes to update all 7 million records, but it should make it much less likely that users will hit errors caused by record locking.
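A minimal sketch of that kind of loop, assuming the table has an integer Id column (all object names here are placeholders):

    DECLARE @CurrentId INT, @MaxId INT, @BatchSize INT;
    SET @BatchSize = 1000;
    SELECT @CurrentId = MIN(Id), @MaxId = MAX(Id) FROM dbo.Documents;

    WHILE @CurrentId <= @MaxId
    BEGIN
        UPDATE dbo.Documents
        SET Body = Body                               -- rewrite so the value is stored in the new format
        WHERE Id >= @CurrentId AND Id < @CurrentId + @BatchSize;

        SET @CurrentId = @CurrentId + @BatchSize;     -- move on to the next ID range
    END

Keying the batches on an indexed ID column keeps each UPDATE's lock footprint small and predictable.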

answered by Yaakov Ellis