Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to add an identity column to an existing database table which has large number of rows

I have a database table which has ~ 40 000 000 rows. I want to add an identity column to this table. How to do it in a log-friendly manner?

When I do the following:

ALTER TABLE table_1
  ADD id INT IDENTITY

this just fills up the entire log space.

Is there any way to do it in a log-friendly manner? The database is on SQL Server 2008.

Thanks, Mohan.

like image 827
Mohan Avatar asked Aug 23 '11 15:08

Mohan


People also ask

How do I add an identity column to an existing column?

You cannot alter a column to be an IDENTITY column. What you'll need to do is create a new column which is defined as an IDENTITY from the get-go, then drop the old column, and rename the new one to the old name.

Can I add identity column to existing table?

To add an IDENTITY column to a table, the table must be at a top level. You cannot add an IDENTITY column as the column of a deeply embedded structured datatype. Adding a column does not affect the existing rows in the table, which get populated with the new column's default value (or NULL).

What if identity column reaches max?

Each time a row is inserted into the table, the identity column is assigned the next highest value. If the identity reaches the maximum value, inserts will fail.


2 Answers

The overall process will probably be a lot slower with more overall locking overhead but if you only care about transaction log size you could try the following.

  1. Add a nullable integer non identity column (metadata only change).
  2. Write code to update this with unique sequential integers in batches. This will reduce the size of each individual transaction and keep the log size down (assuming simple recovery model). My code below does this in batches of 100 hopefully you have an existing PK you can leverage to pick up where you left off rather than the repeated scans that will take increasingly long towards the end.
  3. use ALTER TABLE ... ALTER COLUMN to mark the column as NOT NULL. This will require the entire table to be locked and scanned to validate the change but not require much logging.
  4. Use ALTER TABLE ... SWITCH to make the column an identity column. This is a metadata only change.

Example Code Below

/*Set up test table with just one column*/

CREATE TABLE table_1 ( original_column INT )
INSERT  INTO table_1
        SELECT DISTINCT
                number
        FROM    master..spt_values



/*Step 1 */
ALTER TABLE table_1 ADD id INT NULL



/*Step 2 */
DECLARE @Counter INT = 0 ,
    @PrevCounter INT = -1

WHILE @PrevCounter <> @Counter 
    BEGIN
        SET @PrevCounter = @Counter;
        WITH    T AS ( SELECT TOP 100
                                * ,
                                ROW_NUMBER() OVER ( ORDER BY @@SPID )
                                + @Counter AS new_id
                       FROM     table_1
                       WHERE    id IS NULL
                     )
            UPDATE  T
            SET     id = new_id
        SET @Counter = @Counter + @@ROWCOUNT
    END


BEGIN TRY;
    BEGIN TRANSACTION ;
     /*Step 3 */
    ALTER TABLE table_1 ALTER COLUMN id INT NOT NULL

    /*Step 4 */
    DECLARE @TableScript NVARCHAR(MAX) = '
    CREATE TABLE dbo.Destination(
        original_column INT,
        id INT IDENTITY(' + CAST(@Counter + 1 AS VARCHAR) + ',1)
        )

        ALTER TABLE dbo.table_1 SWITCH TO dbo.Destination;
    '       

    EXEC(@TableScript)


    DROP TABLE table_1 ;

    EXECUTE sp_rename N'dbo.Destination', N'table_1', 'OBJECT' ;


    COMMIT TRANSACTION ;
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0 
        ROLLBACK TRANSACTION ;
    PRINT ERROR_MESSAGE() ;
END CATCH ;
like image 63
Martin Smith Avatar answered Oct 18 '22 18:10

Martin Smith


There are two ways of adding an identity column to a table with existing data:

  1. Create a new table with identity, copy data to this new table then drop the existing table followed by renaming the temp table.

  2. Create a new column with identity & drop the existing column

Reference : http://cavemansblog.wordpress.com/2009/04/02/sql-how-to-add-an-identity-column-to-a-table-with-data/

like image 1
NG. Avatar answered Oct 18 '22 17:10

NG.