Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Granularity of "Block incremental deployment if data loss might occur"

In SQL Server data Tools you have the deployment option "Block incremental deployment if data loss might occur", which I'd wager is a best practice to keep checked.

Lets say we have a table foo, and a column bar which is now redundant - has no dependencies, foreign keys etc etc, and we have already removed references to this column in our data layer and stored procedures as it's simply not used. In other words, we are satisfied that dropping this column will have no adverse effects.

There are a couple of flies in the ointment:

  1. The column has data in it
  2. The database is published to hundreds of distributed clients, and it could take months for the change to ripple out to all clients

As the column is populated, publishing will fail unless we change the "Block incremental deployment if data loss might occur" option. This option is at the database level, not table level however, and so due to the distributed nature of the clients, we'd have to turn off the "data loss" option for months before all databases were updated, and turn it back on once all clients have updated (our databases have version numbers set by our build).

You may think we could solve this with a pre-deployment script such as

if exists (select * from information_schema.columns where table_name = 'foo' and column_name = 'bar') BEGIN
  alter table foo drop constraint DF_foo_bar
  alter table foo drop column bar 
END

But again this fails unless we turn the "data loss could occur" option off.

I'm simply interested as to what others have done in this scenario as I'd like to have granularity which doesn't currently seem possible.

like image 532
Fetchez la vache Avatar asked Aug 12 '13 09:08

Fetchez la vache


2 Answers

So I've been accomplishing this task via the following steps:

1) Since we are going to make table #Foo, make sure to drop that table before moving forward if it exists.
2) In a pre-deployment script: If the column exists, create a temporary table #Foo and select all rows from Foo into #Foo.
3) Remove the column from #Foo
4) Delete all rows in Foo (now there will be no data loss since no data exists)
5) In a post-deployment script: If #Foo exists, select all rows from #Foo into Foo
6) Drop table #Foo

And code:

pre-deployment script

if(Object_ID('TempDB..#Foo') is not null)
begin
    drop table #Foo
end
if exists (
    select *
    from sys.columns
    where Name = 'Bar'
        and Object_ID = Object_ID('Foo')
)
begin    
    select * into #Foo
    from Foo

    alter table #Foo drop column Bar

    -- Now that we've made a complete backup of Foo, we can delete all its data
    delete Foo
end

post-deployment script

if(Object_ID('TempDB..#Foo') is not null)
begin
    insert into Foo
    select * from #Foo

    drop table #Foo
end

Caveat: Depending on your environment, it might be wiser to depend on versions rather than column & temp table existence in your conditionals

like image 59
aaaaaa Avatar answered Sep 29 '22 11:09

aaaaaa


The PreDeployment script doesn't work the way you are hoping to use it because of the order of operations for SSDT:

  1. Schema Comparison
  2. Script generation for schema difference
  3. Execute PreDeployment
  4. Excecute generated script
  5. Execute PostDeployment.

So of course, the schema difference is identified as part of #2 and appropriate SQL is generated to drop the column (including the check to block on data loss), before your manual pre-deployment script can 'get rid of it'.

If you take a look at the script generated behind the scenes to detect (and therefore block) on possible data loss, it checks to see if there are any rows by running something along the lines of this:

IF EXISTS (select top 1 1 from [dbo].[Table]) RAISERROR ('Rows were detected. The schema update is terminating because data loss might occur.', 16, 127)

This means the simple existence of rows will stop the column being dropped. We haven't found any way around this other than manually dealing with the problem outside (and before) the SSDT deployment, using conditional deployment steps based on version numbers.

You mention distributed clients, which implies you have some sort of automated publication/update mechanism. You also mention version numbers as part of the database - could you include in your deploy (before the sqlpackage.exe command I assume you are running) a manual SQL script? This is akin to what we do (ours is in Powershell, but you get the gist):

IF VersionNumber < 2.8 
BEGIN
    ALTER TABLE X DROP COLUMN Y
END

Disclaimer: in no way is that valid SQL, it's simply pseudo code to imply an idea!

like image 31
hanzworld Avatar answered Sep 29 '22 10:09

hanzworld