
Using "SELECT INTO" with Azure SQL to copy data from another DB

I'm trying to automate the initialising of a SQL DB on Azure. For some (lookup) tables, data needs to be copied from a source DB into the new DB each time it is initialised.

To do this I execute a query containing

SELECT * INTO [target_db_name]..[my_table_name] FROM [source_db_name].dbo.[my_table_name]

At this point an exception is thrown telling me that

Reference to database and/or server name in 'source_db_name.dbo.my_table_name' is not supported in this version of SQL Server.

Having looked into this, I've found that it's now possible to reference another Azure SQL DB, provided it has been configured as an external data source.

So, in my target DB I've executed the following statements:

CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';

CREATE DATABASE SCOPED CREDENTIAL cred  
WITH IDENTITY = '<username>',
SECRET = '<password>';

CREATE EXTERNAL DATA SOURCE [source_db_name]
WITH
(
    TYPE=RDBMS,
    LOCATION='my_location.database.windows.net',
    DATABASE_NAME='source_db_name',
    CREDENTIAL= cred
);

CREATE EXTERNAL TABLE [dbo].[my_table_name](
    [my_column_name] BIGINT NOT NULL
)
WITH
(
    DATA_SOURCE = [source_db_name],
    SCHEMA_NAME = 'dbo',
    OBJECT_NAME = 'my_table_name'
);

But the SELECT INTO statement still yields the same exception.

Furthermore, a simple SELECT * FROM [source_db_name].[my_table_name] yields the exception "Invalid object name 'source_db_name.my_table_name'".

What am I missing?

UPDATE
I've found the problem: CREATE EXTERNAL TABLE creates what appears to be a table in the target DB. To query this, the source DB name should not be used. So where I was failing with:

SELECT * FROM [source_db_name].[my_table_name]

I see that I should really be querying

SELECT * FROM [my_table_name]

awj asked Aug 29 '16 15:08



2 Answers

It looks like you might need to define that external table, according to what appears to be the correct syntax:

CREATE EXTERNAL TABLE [dbo].[source_table](
...
)
WITH
(
DATA_SOURCE = source_db_name
);

The three-part name approach is unsupported, except through elastic database query.

Now, since you're creating an external table, a query can treat it as an object native to [target_db]. This allows you to write SELECT * FROM [my_table_name], as you figured out in your edit. From the documentation, it is important to note that "This allows for read-only querying of remote databases." So this table object is not writable, but your question only mentioned reading from it to populate a new table.
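Putting that together, here is a minimal sketch of the copy step. It assumes the external data source [source_db_name] from the question already exists, and it uses a hypothetical [ext] schema (not in the original post) so that the external table and the new local table can share the same table name.

```sql
-- Run in the target database.
-- GO is the batch separator used by SSMS and sqlcmd; CREATE SCHEMA must be
-- the only statement in its batch.

-- Hypothetical schema that keeps external tables apart from local ones.
CREATE SCHEMA [ext];
GO

-- A local object that points at dbo.my_table_name in the source database.
CREATE EXTERNAL TABLE [ext].[my_table_name] (
    [my_column_name] BIGINT NOT NULL
)
WITH
(
    DATA_SOURCE = [source_db_name],
    SCHEMA_NAME = 'dbo',
    OBJECT_NAME = 'my_table_name'
);
GO

-- Copy the remote rows into a new local table. No database name appears in
-- the FROM clause; the external table is referenced like any local table.
SELECT *
INTO [dbo].[my_table_name]
FROM [ext].[my_table_name];
```

Note that SELECT INTO creates the new table with the column definitions only. Keys, constraints and indexes are not copied, so they would need to be added afterwards if the lookup table requires them.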

Dan Rediske answered Oct 04 '22 21:10


As promised, here's how I handle database deploys for SQL Server. I use the same method for on-prem, Windows Azure SQL Database, or SQL on a VM in Azure. It took a lot of pain and trial and error to get here.

It all starts with SQL Server Data Tools, SSDT

If you're not already using SSDT to manage your database as a project separate from your applications, you need to. Grab a copy from Microsoft's SSDT download page. If you are already running a version of Visual Studio on your machine, you can get a version of SSDT specific to that version of Visual Studio. If you aren't already running VS, then you can just grab SSDT and it will install the minimal Visual Studio components to get you going.

Setting up your first Database project is easy! Start a new Database project.

[Screenshot: new project dialog]

Then, right click on your database project and choose Import -> Database.

[Screenshot: importing an existing database]

Now, you can point at your current development copy of your database and import its schema into your project. This process will pull in all the tables, views, stored procedures, functions, etc. from the source database. When you're finished you will see something like the following image.

[Screenshot: your new database project]

There is a folder for each schema imported, as well as a security folder for defining the schemas in your database. Explore these folders and look through the files created.

You will find all the scripts created are the CREATE scripts. This is important to remember for managing the project. You can now save your new solution, and then check it into your current source control system. This is your initial commit.

Here's the new thought process for managing your database project. As you need to make schema changes, you will come into this project and edit these CREATE statements to define the state you want each object to be in. You are always writing CREATE statements, never ALTER statements, in your schema. Check out the example below.

Updating a table

Let's say we've decided to start tracking changes on our dbo.ETLProcess table. We will need columns to track CreatedDateTime, CreatedByID, LastUpdatedDateTime, and LastUpdatedByID. Open the dbo.ETLProcess file in the dbo\Tables folder and you'll see the current version of the table looks like this:

CREATE TABLE [dbo].[ETLProcess] (
       [ETLProcessID] INT             IDENTITY (1, 1) NOT NULL
     , [TenantID]     INT             NOT NULL
     , [Name]         NVARCHAR (255)  NULL
     , [Description]  NVARCHAR (1000) NULL
     , [Enabled]      BIT             DEFAULT ((1)) NOT NULL
     , CONSTRAINT [PK_ETLProcess__ETLProcessID_TenantID] 
            PRIMARY KEY CLUSTERED ([ETLProcessID], [TenantID])
     , CONSTRAINT [FK_ETLProcess_Tenant__TenantID] 
            FOREIGN KEY ([TenantID]) 
            REFERENCES [dbo].[Tenant] ([TenantID])
 );

To record the change we want to make, we simply add in the columns into the table like this:

CREATE TABLE [dbo].[ETLProcess] (
       [ETLProcessID]         INT             IDENTITY (1, 1) NOT NULL
     , [TenantID]             INT             NOT NULL
     , [Name]                 NVARCHAR (255)  NULL
     , [Description]          NVARCHAR (1000) NULL
     , [Enabled]              BIT             DEFAULT ((1)) NOT NULL
     , [CreatedDateTime]      DATETIME        DEFAULT(GETUTCDATE())
     , [CreatedByID]          INT
     , [LastUpdatedDateTime]  DATETIME        DEFAULT(GETUTCDATE())
     , [LastUpdatedByID]      INT
     , CONSTRAINT [PK_ETLProcess__ETLProcessID_TenantID] 
            PRIMARY KEY CLUSTERED ([ETLProcessID], [TenantID])
     , CONSTRAINT [FK_ETLProcess_Tenant__TenantID] 
            FOREIGN KEY ([TenantID]) 
            REFERENCES [dbo].[Tenant] ([TenantID])
 );

I didn't add any foreign keys to the definition, but if you wanted to create them, you would add them below the Foreign Key to Tenant. Once you've made the changes to the file, save it.

The next thing you'll want to get in the habit of is checking your database to make sure it's valid. In the programming world, you'd run a test build to make sure it compiles. Here, we do something very similar. From the main menu hit Build -> Build Database1 (the name of our database project).

The output window will open and tell you if there are any problems with your project. This is where you'll see things like Foreign keys referencing tables that don't yet exist, bad syntax in your create object statements, etc. You'll want to clean these up before you check your update into source control. You'll have to fix them before you will be able to deploy your changes to your development environment.

Once your database project builds successfully and it's checked in to source control, you're ready for the next change in process.

Deploying Changes

Earlier I told you it was important to remember all your schema statements are CREATE statements. Here's why: SSDT gives you two ways to deploy your changes to a target instance. Both of them use these CREATE statements to compare your project against the target. By comparing two CREATE statements it can generate the ALTER statements needed to get a target instance up to date with your project.
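For the dbo.ETLProcess change above, the comparison boils down to something like the following. This is a hand-written sketch of the idea, not actual SSDT output; a generated script will likely differ in detail, for example in how it names the default constraints.

```sql
-- Hand-written equivalent of the difference between the old and new
-- CREATE TABLE definitions of dbo.ETLProcess shown earlier.
-- The four new columns carry no NOT NULL in the project, so they are
-- added as nullable here.
ALTER TABLE [dbo].[ETLProcess]
    ADD [CreatedDateTime]     DATETIME NULL DEFAULT (GETUTCDATE()),
        [CreatedByID]         INT      NULL,
        [LastUpdatedDateTime] DATETIME NULL DEFAULT (GETUTCDATE()),
        [LastUpdatedByID]     INT      NULL;
```

The point is that you never write this ALTER yourself. You only edit the CREATE TABLE file, and the tooling derives the change for each target.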

The two options for deploying these changes are a T-SQL change script or a dacpac. Based on the original post, it sounds like the change script will be most familiar.

Right click on your database project and choose Schema Compare.

[Screenshot: Schema Compare]

By default, your database project will be the source on the left. Click Select target on the right, and select the database instance you want to "upgrade". Then click Compare in the upper left, and SSDT will compare the state of your project with the target database.

You will then get a list of all the objects in your target database that are not in the project (in the DROP section), a list of all objects that are different between the project and target database (in the ALTER Section), and a list of objects that are in your project and not yet in your target database (in the ADD section).

Sometimes you'll see changes listed that you don't want to make (changes in the casing of your object names, or the number of parentheses around your default statements). You can deselect changes like that. Other times you won't be ready to deploy certain changes to the target; you can deselect those too. All items left checked will either be applied to the target database, if you choose Update (red box below), or added to your change script, if you hit the "Generate Script" icon (green box below).

[Screenshot: deploy change options]

Handling lookup data in your Database Project

Now we finally get to your original question: how do I deploy lookup data to a target database? In your database project you can right click on the project in Solution Explorer and choose Add -> New Item. You'll get a dialog box. On the left, click on User Scripts, then on the right, choose Post-Deployment Script.

[Screenshot: Post Deployment step]

By adding a script of this type, SSDT knows you want to run this step after any schema changes. This is where you will enter your lookup values. As a result, they're included in source control!

Now here's a very important note about these post deployment scripts. You need to be sure any T-SQL you add here will work if you call the script in a new database, in an existing database, or if you called it 100 times in a row. As a result of this requirement, I've taken to including all my lookup values in merge statements. That way I can handle inserts, updates, and deletes.
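A minimal sketch of such a re-runnable post-deployment script follows. The lookup table [dbo].[StatusType], its columns and its values are hypothetical (they are not part of the project described here), and the table is assumed to be defined in the database project so that it exists by the time this script runs.

```sql
-- Post-deployment script: safe to run against a new database, an existing
-- database, or many times in a row.
-- [dbo].[StatusType] and its rows are hypothetical placeholders.
MERGE INTO [dbo].[StatusType] AS tgt
USING (VALUES
    (1, N'Pending'),
    (2, N'Running'),
    (3, N'Complete')
) AS src ([StatusTypeID], [Name])
    ON tgt.[StatusTypeID] = src.[StatusTypeID]
-- Row exists but its name has drifted: bring it back in line.
WHEN MATCHED AND tgt.[Name] <> src.[Name] THEN
    UPDATE SET [Name] = src.[Name]
-- Row is missing from the target: insert it.
WHEN NOT MATCHED BY TARGET THEN
    INSERT ([StatusTypeID], [Name])
    VALUES (src.[StatusTypeID], src.[Name])
-- Row is in the target but no longer in the list: remove it.
WHEN NOT MATCHED BY SOURCE THEN
    DELETE;
```

Because the VALUES list is the single source of truth, a second run finds nothing to change. If other tables reference the lookup rows, you may prefer to drop the DELETE branch so that removing a value from the list cannot fail on a foreign key.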

Before committing this file to source control, test it in all three scenarios above to be sure it won't fail.

Wrapping it all up

Moving from making changes directly in your target environments to using SSDT and source controlling your changes is a big step in the maturation of your software development life-cycle. The good news is it makes you think about your database as part of the deployment process in a way that is compatible with continuous integration/continuous deployment methods.

Once you get used to the new process, you can then learn how to add a dacpac generated from SSDT into your deployment scripts and have the changes pushed at just the right time in your deployment.

It also frees you from your original SELECT INTO problem.

Shannon Lowder answered Oct 04 '22 22:10