SQL Server - Drop columns from a VLT (Very Large Table)

Can anyone please advise on the best way to achieve the following:

Requirement: Drop 5 columns from a VLT that is nearly 400 GB in size.

As soon as we try to do this, we run into space issues on PRODUCTION and timeout errors (through SSMS).

We tried inserting into a temporary table (keeping identity off), but after we had inserted all of the data (nearly a billion rows) and tried to switch identity back on, we ran into timeout errors again.

Would it be better to do these operations through PowerShell rather than SSMS?

Limitation: Limited space on production; tempdb grows quickly because of these operations.

Please advise on the best approach to drop columns from a VLT.

Regards

asked Feb 14 '12 by Conrad Jagger


1 Answer

I would take one of the approaches already mentioned but with some key modifications. Assuming you are on SQL Server 2008, do the following:

  1. Make a zero-length copy of your existing very large table with only the columns you want to keep:

    select top 0 {{column subset}} into tbl_tableB from tableA
    

    Be sure to also copy any indexes, constraints, etc. to the new table, since SELECT...INTO does not create them (a sketch of recreating an index follows these steps). Identity columns will be handled appropriately by the SELECT...INTO statement.

  2. Rename the original table; we will replace it with a view in the next step.

    exec sys.sp_rename @objname = 'tableA', @newname = 'tbl_tableA'
    
  3. Create a view using the original table name and UNION ALL:

    create view tableA
    as
    select {{column subset}} from tbl_tableA
    union all
    select {{column subset}} from tbl_tableB
    

    This will maintain some level of compatibility with applications querying the data. INSERTs, UPDATEs, and DELETEs will have to be handled via INSTEAD OF triggers on the view (a trigger sketch follows these steps). The UNION ALL avoids tempdb pressure because it needs no sorting or duplicate removal (versus a straight UNION), and we will never have more than one copy of a row in existence at a time.

  4. Use a DELETE combined with an OUTPUT clause to delete data in batches from the original table and simultaneously insert it into the new table:

    BEGIN TRAN
    DELETE TOP (1000) /* or whatever batch size you want */
    FROM
        tbl_tableA
    OUTPUT
        DELETED.{{column subset}} /* list each kept column here, prefixed by DELETED. */
    INTO
        tbl_tableB (
            {{column subset}} /* again list each column here */
        )
    /* Check for errors */
    /* COMMIT or ROLLBACK */
    /* rinse and repeat [n] times */
    
  5. Once you're done with the DELETEs/INSERTs, drop the view, drop the original table, rename the new table:

    drop view tableA
    drop table tbl_tableA
    exec sys.sp_rename @objname = 'tbl_tableB', @newname = 'tableA'
    
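Regarding the note in step 1: SELECT...INTO creates the new table without any indexes or constraints, so they have to be scripted from the original table and recreated on tbl_tableB. A minimal sketch, assuming a hypothetical ID key column and SomeColumn index; substitute the real definitions from tbl_tableA:

    -- Hypothetical names: PK_tbl_tableB on ID and an index on SomeColumn.
    -- Use the actual key columns and index definitions from tbl_tableA.
    alter table tbl_tableB
        add constraint PK_tbl_tableB primary key clustered (ID)

    create nonclustered index IX_tbl_tableB_SomeColumn
        on tbl_tableB (SomeColumn)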

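For the write compatibility mentioned in step 3: a UNION ALL view is not directly updatable, so modifications need INSTEAD OF triggers. A minimal sketch of the INSERT case, using the same {{column subset}} placeholder as the steps above; UPDATE and DELETE triggers would follow the same pattern, modifying whichever underlying table currently holds the affected rows:

    -- Sketch only: route new rows straight into the new table so the old
    -- table is only ever drained by the batched DELETE in step 4.
    create trigger trg_tableA_insert
    on tableA
    instead of insert
    as
    begin
        set nocount on;
        insert into tbl_tableB ({{column subset}})
        select {{column subset}} from inserted;
    end

If the kept columns include an identity column, it may be simpler to leave it out of the insert list here and let tbl_tableB generate new values.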
The overriding merit of this approach is that the DELETE and INSERT happen simultaneously in the same transaction, meaning the data will always be in a consistent state. You can increase the size of the batch by changing the TOP clause, giving you more control over transaction log usage and blocking. I've tested this exact approach on tables with and without identity columns and it works great. On a very large table, it will take a while to run; could be several hours to several days but it will complete with the desired result.
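If you want to automate the "rinse and repeat" in step 4, one option (a sketch, not part of the original answer) is to wrap the batched DELETE...OUTPUT in a WHILE loop with TRY/CATCH, so each batch commits on its own and the log space can be reused between batches. It again uses the {{column subset}} placeholder and sticks to SQL Server 2008 syntax:

    declare @batchsize int = 1000;   /* tune for log usage and blocking */
    declare @rows int = 1;

    while @rows > 0
    begin
        begin try
            begin tran;

            delete top (@batchsize)
            from tbl_tableA
            output deleted.{{column subset}}
            into tbl_tableB ({{column subset}});

            set @rows = @@rowcount;   /* 0 once tbl_tableA is empty */

            commit tran;
        end try
        begin catch
            if @@trancount > 0 rollback tran;
            declare @err nvarchar(2048) = error_message();
            raiserror(@err, 16, 1);   /* report the failure */
            break;                    /* and stop the loop */
        end catch
    end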

answered Sep 19 '22 by Mitch Schroeter