I have a database in production with one table that has grown extremely large (lots of accumulated data).
To improve query performance I used the SQL Server optimizer, which suggested a new index.
So I made a copy of the production database to test against, and the index does improve performance. My problem is that it took about 24 hours to create the index, and while the index is being created the application is unusable.
For this particular application, being down for a few hours is not a problem, but 24 hours of downtime would be, and I am looking for a way to create this index without that.
I only have a few ideas at the moment.
One idea is to copy a backup to another server, apply the new index and any other changes there, and copy the result back to the production server. Then take the application down and merge in any data added since the backup was taken.
Of course this has its own set of problems, like having to merge the data back together, so I don't like this idea.
This is SQL Server 2008 Standard Edition.
I normally deploy database changes by script.
UPDATE: Another idea would be to move the archive data out of the main table in chunks over several days, create the index once the table is small enough, and then slowly migrate the data back.
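A rough sketch of what the chunked move could look like in T-SQL. Everything named here is made up for illustration: the tables `dbo.BigTable` and `dbo.BigTable_Archive`, the columns `Id`, `CreatedAt` and `Payload`, the cutoff date and the batch size. It also assumes the archive table has no triggers, foreign keys or CHECK constraints (which `OUTPUT ... INTO` requires) and that its `Id` column is a plain column rather than an IDENTITY.

```sql
-- Hypothetical names throughout: dbo.BigTable, dbo.BigTable_Archive, Id, CreatedAt, Payload.
DECLARE @cutoff datetime = '20080101';  -- rows older than this count as "archive" data (assumed)
DECLARE @batch  int      = 10000;       -- rows moved per statement (assumed; tune for your system)
DECLARE @rows   int      = 1;

WHILE @rows > 0
BEGIN
    -- Each DELETE runs as its own short transaction, so locks are held only briefly.
    DELETE TOP (@batch) FROM dbo.BigTable
    OUTPUT DELETED.Id, DELETED.CreatedAt, DELETED.Payload
        INTO dbo.BigTable_Archive (Id, CreatedAt, Payload)
    WHERE CreatedAt < @cutoff;

    SET @rows = @@ROWCOUNT;

    -- Pause between batches so normal application traffic can get through.
    WAITFOR DELAY '00:00:01';
END;
```

Moving the rows back later would be the same loop with the two tables swapped. Keep in mind that the new index then has to be maintained for every row that is re-inserted, so the migration back will likely be slower than the move out.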
Creating indexes on a very large table takes over 5 hours.
By changing the number of processors SQL Server can use in parallel, in other words the maximum degree of parallelism (MAXDOP), we can improve index rebuild performance. This option is set to zero instance-wide by default. Zero does not mean "use zero processors"; it means SQL Server may use all available processors.
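For illustration, this is roughly how the instance-wide setting can be checked and then overridden for a single index build. The index, table and column names and the value 4 are placeholders, not taken from the question. One caveat: as far as I know, parallel index operations are an Enterprise-only feature in SQL Server 2008, so on Standard the option may have no effect; check the feature list for your edition before relying on it.

```sql
-- Show the current instance-wide setting (0 = SQL Server may use all available processors).
EXEC sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
EXEC sp_configure 'max degree of parallelism';
GO

-- Override the setting for this one index build only.
-- IX_BigTable_CreatedAt, dbo.BigTable and CreatedAt are hypothetical names.
CREATE NONCLUSTERED INDEX IX_BigTable_CreatedAt
    ON dbo.BigTable (CreatedAt)
    WITH (MAXDOP = 4);
GO
```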
If you were using Enterprise, you could use the ONLINE option of CREATE INDEX, which builds the index without keeping long-term locks on the table. There are caveats around its use; see the linked article for details, and you might find the performance impact to be too great. But it's academic, as you've said you're using Standard (sorry for missing that at first).
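For reference, on Enterprise the statement would look something like the following (the index, table and column names are placeholders). On Standard I would expect it to fail with an error saying that online index operations are not available in that edition.

```sql
-- Enterprise Edition only. Names are hypothetical.
CREATE NONCLUSTERED INDEX IX_BigTable_CreatedAt
    ON dbo.BigTable (CreatedAt)
    WITH (ONLINE = ON);
```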
The fact it's a VM immediately makes one think in terms of temporarily "pumping up" the VM or even temporarily relocating to a maxed-out non-VM. For rebuilding an index on a very large table, I'd think RAM and I/O speed would be the biggest factors; is the VM using a drive directly or a virtualized drive? Can you temporarily relocate the data to a physical drive? That sort of thing.
FWIW, your take-it-offline-and-do-it idea is exactly what I'd do on a MySQL database (never had to on an SQL Server database): Take the main DB down, grab a snapshot, clear the binlogs/enable binlogging, and fire it back up. Make the index on a separate machine. When ready, take the DB down, make a backup of the updated DB (just in case), put back the snapshot, apply the binlogs, and bring the DB back up. It really is that easy; I expect you can do that with SQL Server as well. Of course, it does assume that you can apply 24 hours of binlogs against the (newly optimized) table within an acceptable time window!