 

Best way to deploy new index to very large table in SQL Server 2008


I have a database in production with one table that has grown extremely large (lots of accumulated data).

To improve query performance I used the SQL Server optimizer, which suggested a new index.

So I made a copy of the production database to test against, and the new index does improve performance. However, it took about 24 hours to create the index, and while the index is being created the application is unusable.

For this particular application, being down for a few hours is not a problem, but 24 hours of downtime would be, so I am looking for a way to create this index without that.

I only have a few ideas at the moment.

One idea is to copy a backup to another server, apply the new index and any other changes, copy the backup back to the production server, and then take the application down and merge in any new data added since the backup was taken.

Of course this has its own set of problems, like having to merge the data back together, so I don't like this idea for that reason.
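For concreteness, the copy-out and copy-back steps are just an ordinary backup and restore; a minimal sketch follows, assuming a hypothetical database named MyAppDb, made-up file paths, and made-up logical file names (the merge step, which is the hard part, is not shown):

```sql
-- On production: take a full copy-only backup (hypothetical names/paths).
BACKUP DATABASE MyAppDb
TO DISK = N'D:\Backups\MyAppDb_full.bak'
WITH COPY_ONLY, INIT;

-- On the other server: restore the backup, then build the new index there.
RESTORE DATABASE MyAppDb
FROM DISK = N'D:\Backups\MyAppDb_full.bak'
WITH MOVE N'MyAppDb'     TO N'E:\Data\MyAppDb.mdf',
     MOVE N'MyAppDb_log' TO N'E:\Logs\MyAppDb_log.ldf',
     RECOVERY;
```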

This is SQL Server 2008 Standard Ed.

I normally deploy database changes by script.

UPDATE: Another idea would be to move the archive data out of the main table over several days in chunks, create the index once the table is small enough, and then slowly migrate the data back.
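If it helps, a rough sketch of that chunked archiving idea is below, assuming a hypothetical dbo.BigTable with a CreatedDate column and an archive table dbo.BigTable_Archive with the same columns (batch size and cut-off date are made up, and the archive table must have no triggers or foreign keys for OUTPUT INTO to work):

```sql
-- Move old rows out of the main table in small batches so the table
-- shrinks before the index is created (hypothetical names throughout).
DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION;

    -- Delete one batch from the main table and copy the deleted rows
    -- into the archive table in the same statement.
    DELETE TOP (10000) FROM dbo.BigTable
    OUTPUT DELETED.* INTO dbo.BigTable_Archive
    WHERE CreatedDate < '20090101';

    SET @rows = @@ROWCOUNT;

    COMMIT TRANSACTION;
END;
```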

asked Feb 22 '10 by Zack


People also ask

How long does it take to create an index on a large table?

Creating an index on a very large table can take over five hours.

How do I speed up index rebuild in SQL Server?

By changing the number of processors SQL Server can use in parallel, in other words the maximum degree of parallelism (MAXDOP), we can improve index rebuild performance. This option defaults to zero instance-wide; that does not mean "use zero processors", it means SQL Server decides the degree of parallelism itself.
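As a rough illustration, a per-statement MAXDOP override on a rebuild looks like this (index and table names are hypothetical; note that parallel index operations are, as far as I know, an Enterprise Edition feature, so this may not help on Standard):

```sql
-- Rebuild one index using up to 8 processors instead of the instance default
-- (hypothetical index/table names).
ALTER INDEX IX_BigTable_CreatedDate
ON dbo.BigTable
REBUILD WITH (MAXDOP = 8, SORT_IN_TEMPDB = ON);
```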


1 Answer

If you were using Enterprise, you could use the ONLINE option of CREATE INDEX, which builds the index without keeping long-term locks on the table. There are caveats around its use; see the linked article for details, and you might find the performance impact to be too great. But it's academic, as you've said you're using Standard (sorry for missing that at first).
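For completeness, an online build would look roughly like this (index name, key column, and INCLUDE list are hypothetical; Enterprise Edition only):

```sql
-- Enterprise only: build the index while the table stays available
-- for reads and writes (hypothetical index/column names).
CREATE NONCLUSTERED INDEX IX_BigTable_CreatedDate
ON dbo.BigTable (CreatedDate)
INCLUDE (CustomerId)
WITH (ONLINE = ON, SORT_IN_TEMPDB = ON);
```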

The fact that it's a VM immediately makes one think in terms of temporarily "pumping up" the VM, or even temporarily relocating to a maxed-out non-VM. For rebuilding an index on a very large table, I'd think RAM and I/O speed would be the biggest factors; is the VM using a drive directly or a virtualized drive? Can you temporarily relocate the data to a physical drive? That sort of thing.

FWIW, your take-it-offline-and-do-it idea is exactly what I'd do on a MySQL database (never had to on an SQL Server database): Take the main DB down, grab a snapshot, clear the binlogs/enable binlogging, and fire it back up. Make the index on a separate machine. When ready, take the DB down, make a backup of the updated DB (just in case), put back the snapshot, apply the binlogs, and bring the DB back up. It really is that easy; I expect you can do that with SQL Server as well. Of course, it does assume that you can apply 24 hours of binlogs against the (newly optimized) table within an acceptable time window!

answered Sep 21 '22 by T.J. Crowder