
Create a table in Azure SQL Database from Blob Storage

I need to take large tables out of our Azure Data Warehouse and move them to standalone Azure SQL Databases. I haven't been able to get Data Factory to work quickly enough for my scenario. I can get my tables into Blob storage from my Data Warehouse via external tables. What I cannot figure out is how to create an external table on an Azure SQL Database with an external data source pointing to my Blob storage.

This is the file format, external data source, and external table used to get my table into blob storage:

CREATE EXTERNAL FILE FORMAT [DelimitedText] 
WITH (
    FORMAT_TYPE = DELIMITEDTEXT, 
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = N'~¶~', 
        USE_TYPE_DEFAULT = False
    ), 
    DATA_COMPRESSION = N'org.apache.hadoop.io.compress.GzipCodec')
GO

CREATE EXTERNAL DATA SOURCE [myDataSource] 
WITH (
    TYPE = HADOOP, 
    LOCATION = N'wasbs://<blob container>@<storage account>.blob.core.windows.net', 
    CREDENTIAL = [myCredential])
GO

CREATE EXTERNAL TABLE [dbo].[myTable] 
WITH (
    DATA_SOURCE = [myDataSource] ,
    LOCATION = N'MY_FOLDER/',
    FILE_FORMAT = [DelimitedText]
)
AS
SELECT *
FROM dbo.mytable

The only external data source I'm able to create in the Azure SQL Database is of TYPE=SHARD_MAP_MANAGER. Is that right, and is that type necessary? The link below suggests I should be able to create an external data source using TYPE=HADOOP, but I get an "error near EXTERNAL" error. I'm also unable to create an EXTERNAL FILE FORMAT. Is that possible in Azure SQL Database?

https://msdn.microsoft.com/en-us/library/dn935022.aspx#Examples: Azure SQL Database

Ultimately, I'm trying to create an external table over my blob storage, insert from that blob into a table in my Azure SQL Database, and then drop the container.

BamBamBeano asked Jan 20 '17 15:01



2 Answers

It is not possible to use PolyBase features on Azure SQL Database; among SQL Server databases, they are available only on on-premises SQL Server 2016.

In the article there is a note:

PolyBase is supported only on SQL Server 2016, Azure SQL Data Warehouse, and Parallel Data Warehouse. Elastic Database queries are supported only on Azure SQL Database v12 or later.

Instead, you could create an Azure SQL Data Warehouse (on the same Azure SQL Server if you wish). The guide will work for you if you run it there. It will not work for Hadoop (https://msdn.microsoft.com/en-us/library/mt703314.aspx), but as I understand your question, you are importing from Azure Blob Storage, and that will work on Azure SQL Data Warehouse.

yoape answered Oct 12 '22 21:10



Azure SQL Database has recently gained the ability to load files from Azure Blob Storage using either BULK INSERT or OPENROWSET. Start here.

Two simple code examples taken from the linked article:

BULK INSERT Product
FROM 'data/product.dat'
WITH ( DATA_SOURCE = 'MyAzureBlobStorageAccount');

SELECT Color, count(*)
FROM OPENROWSET(BULK 'data/product.bcp', DATA_SOURCE = 'MyAzureBlobStorage',
 FORMATFILE='data/product.fmt', FORMATFILE_DATA_SOURCE = 'MyAzureBlobStorage') as data
GROUP BY Color;
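
Both statements above rely on an external data source that already exists. As a rough sketch of the setup, assuming access through a shared access signature: the credential name, the passwords and tokens in angle brackets, the file name, and the target table are all hypothetical placeholders, and the field terminator is borrowed from the question's file format. Whether BULK INSERT handles that non-ASCII terminator as intended is worth checking on a small file first.

```sql
-- Sketch only: all names and <placeholders> below are hypothetical.

-- A database master key must exist before a database scoped credential can be created.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- Credential holding a shared access signature (SAS) for the container.
-- The SECRET is the SAS token without the leading '?'.
CREATE DATABASE SCOPED CREDENTIAL MyAzureBlobStorageCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<SAS token>';

-- External data source pointing at the storage account and container.
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://<storage account>.blob.core.windows.net/<blob container>',
    CREDENTIAL = MyAzureBlobStorageCredential
);

-- Load one exported file into an existing table whose columns match the file.
BULK INSERT dbo.myTable
FROM 'MY_FOLDER/<file name>'
WITH (
    DATA_SOURCE = 'MyAzureBlobStorage',
    FIELDTERMINATOR = '~¶~',   -- delimiter from the question's file format
    ROWTERMINATOR = '0x0a',    -- assumes line-feed row endings
    CODEPAGE = '65001'         -- assumes the exported files are UTF-8
);
```

As far as I know, BULK INSERT does not read gzip-compressed files, so an export written with the GzipCodec setting from the question would likely need to be produced without DATA_COMPRESSION before it can be loaded this way.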
wBob answered Oct 12 '22 19:10
