
SQL Server ALTER field NOT NULL takes forever

I want to alter a column in a table that has about 4 million records. I have verified that none of the existing values in this column are NULL, and I want to ALTER the column to NOT NULL:

ALTER TABLE dbo.MyTable
ALTER COLUMN myColumn int NOT NULL

... seems to take forever. Is there any way to speed it up, or am I stuck running it overnight during off hours?

Also could this cause a table lock?

Chris Klepeis asked Jul 20 '09


2 Answers

You can make the column effectively NOT NULL without SQL Server validating every existing row. If you are concerned about running the ALTER during business hours, you can instead add a CHECK constraint to the column that ensures it isn't NULL. Adding the constraint WITH NOCHECK means SQL Server does not scan each of the 4 million rows when the constraint is created.

CREATE TABLE Test
(
    T0 INT Not NULL,
    T1 INT NULL
)

INSERT INTO Test VALUES(1, NULL) -- Works!

ALTER TABLE Test
    WITH NOCHECK
        ADD CONSTRAINT N_null_test CHECK (T1 IS NOT NULL)

INSERT INTO Test VALUES(1, NULL) -- Doesn't work now!

Really you have two options (see the edit below for a third):

  1. Add the constraint, which prevents any new NULLs from being inserted or updated into the column while leaving the existing rows unaltered.
  2. Update any rows that are NULL to some other value, then apply the NOT NULL alteration. This really should be run during off hours, unless you don't mind other processes being locked out of the table.

Depending on your specific scenario, either option might be better for you. I wouldn't reject an option just because it has to run during off hours, though. In the long run, the time you spend updating in the middle of the night will be well spent compared to the headaches you'll likely face by taking a shortcut to save a couple of hours.

That being said, if you go with option two you can minimize the amount of work done during off hours. Since every row must be non-NULL before you alter the column, you can write a cursor to slowly (relative to doing it all at once):

  1. Go through each row.
  2. Check whether it is NULL.
  3. Update it appropriately. This will take a good while, but it won't lock the whole table and block other programs from accessing it. (Don't forget the WITH (ROWLOCK) table hint!)
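The same idea can also be expressed as a batched UPDATE rather than a row-by-row cursor. A minimal sketch, assuming the table and column from the question and assuming 0 is an acceptable replacement value (both are placeholders for your own names and logic):

```sql
-- Sketch only: dbo.MyTable, myColumn, the batch size, and the
-- replacement value 0 are all assumptions to adapt.
-- Small batches keep locks short-lived so other sessions can work.
DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    UPDATE TOP (5000) t WITH (ROWLOCK)
    SET    t.myColumn = 0        -- whatever "appropriately" means for you
    FROM   dbo.MyTable AS t
    WHERE  t.myColumn IS NULL;

    SET @rows = @@ROWCOUNT;
END

-- Later, during off hours, once no NULLs remain:
-- ALTER TABLE dbo.MyTable ALTER COLUMN myColumn int NOT NULL;
```

Each iteration touches at most 5000 rows, so the locks it takes are brief; the final ALTER still needs its own maintenance window, but it no longer has any NULLs to clean up.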

EDIT: I just thought of a third option: create a new table with the appropriate columns, then export the data from the original table into the new one. When that is done, drop the original table and rename the new one to the old name. To do this you'll have to disable the dependencies on the original table and set them back up on the new one when you are done, but this process greatly reduces the amount of work you have to do during off hours. This is the same approach SQL Server uses when you reorder a table's columns through Management Studio.

For this approach, do the insert in chunks to make sure you don't cause undue stress on the system or stop others from accessing it. Then, during off hours, drop the original, rename the second, and reapply the dependencies etc. You'll still have some off-hours work, but it will be minuscule compared to the other approach.

See the SQL Server documentation on sp_rename for the rename step.
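A rough sketch of this copy-and-swap approach (all object names, the key column SomeKey, and the chunk size are hypothetical; scripting out real dependencies, indexes, and permissions is left out):

```sql
-- 1. New table with the desired definition
CREATE TABLE dbo.MyTable_New
(
    SomeKey  int NOT NULL PRIMARY KEY,  -- assumed key column
    myColumn int NOT NULL
    -- ... remaining columns ...
);

-- 2. Copy in chunks during normal hours; rerun until 0 rows are copied
INSERT INTO dbo.MyTable_New (SomeKey, myColumn)
SELECT TOP (10000) s.SomeKey, s.myColumn
FROM   dbo.MyTable AS s
WHERE  NOT EXISTS (SELECT 1
                   FROM dbo.MyTable_New AS d
                   WHERE d.SomeKey = s.SomeKey);

-- 3. During off hours: swap the tables
EXEC sp_rename 'dbo.MyTable', 'MyTable_Old';
EXEC sp_rename 'dbo.MyTable_New', 'MyTable';
```

The NOT EXISTS guard makes step 2 safe to rerun, which is what lets the bulk of the copying happen while the system is live.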

kemiller2002 answered Oct 01 '22


The only way to do this "quickly" (*) that I know of is by

  • creating a 'shadow' table which has the required layout
  • adding a trigger to the source table so any insert/update/delete operations are copied to the shadow table (mind to catch any NULLs that might pop up!)
  • copying all the data from the source to the shadow table, potentially in smallish chunks (make sure the trigger(s) can cope with rows that were already copied, and make sure the data fits the new structure, e.g. by wrapping nullable values in ISNULL())
  • scripting out all dependencies from / to other tables
  • when all is done, doing the following inside an explicit transaction:
    • take an exclusive table lock on the source table and one on the shadow table
    • run the scripts to drop dependencies on the source table
    • rename the source table to something else (e.g. suffix _old)
    • rename the shadow table to the source table's original name
    • run the scripts to create all the dependencies again

You might want to do the last step outside of the transaction, as it can take quite a bit of time depending on the number and size of the tables referencing this one; the earlier steps won't take much time at all.
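The sync trigger and final swap could look roughly like this. All object names, the key column SomeKey, and the ISNULL() replacement value are assumptions; the dependency drop/create scripts are elided:

```sql
-- Keep the shadow table in sync while the bulk copy runs
CREATE TRIGGER trg_MyTable_Sync ON dbo.MyTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- remove old versions of changed/deleted rows
    DELETE s
    FROM   dbo.MyTable_Shadow AS s
    WHERE  EXISTS (SELECT 1 FROM deleted AS d
                   WHERE d.SomeKey = s.SomeKey);

    -- re-insert current versions, catching any NULLs that pop up
    INSERT INTO dbo.MyTable_Shadow (SomeKey, myColumn)
    SELECT i.SomeKey, ISNULL(i.myColumn, 0)
    FROM   inserted AS i;
END;
GO

-- The final swap, inside one explicit transaction
BEGIN TRAN;
    -- exclusive locks so nothing slips in between the renames
    SELECT TOP (1) 1 FROM dbo.MyTable        WITH (TABLOCKX, HOLDLOCK);
    SELECT TOP (1) 1 FROM dbo.MyTable_Shadow WITH (TABLOCKX, HOLDLOCK);

    -- ... run the drop-dependency scripts here ...
    EXEC sp_rename 'dbo.MyTable', 'MyTable_Old';
    EXEC sp_rename 'dbo.MyTable_Shadow', 'MyTable';
    -- ... run the create-dependency scripts here ...
COMMIT;
```

Because the trigger keeps the shadow table current, the only downtime is the transaction around the renames, which is exactly the "least possible downtime" the steps above are aiming for.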

As always, it's probably best to do a test run on a test-server first =)

PS: please do not be tempted to recreate the FKs WITH NOCHECK; that renders them futile, as the optimizer will neither trust them nor consider them when building a query plan.

(*: where "quickly" means: with the least possible downtime)

deroby answered Oct 01 '22