Create Blobs with Filename and Data from SQL Rows

There is plenty of documentation on how to use Azure Data Factory to read data from blobs into SQL, and even documentation on how to dump the output of a query to a single blob. I'm trying to create one blob for each row in a table (on Azure SQL Server), named by one field and containing the data in another.

My table has a GUID id field and an nvarchar(max) data field (which contains JSON, though that's largely irrelevant). Suppose I have the following rows:

                 id                    |  data
---------------------------------------+----------
38b2f551-5f13-40ce-8512-c108a05ecd44   |  foo
4db5b25b-1194-44e9-a7b2-bc8889c32979   |  bar
2a3bd653-ce14-4bd2-9243-6923e97224c6   |  baz

I want the following blobs to be created:

https://mycontainer.blob.core.windows.net/myblobs/38b2f551-5f13-40ce-8512-c108a05ecd44
https://mycontainer.blob.core.windows.net/myblobs/4db5b25b-1194-44e9-a7b2-bc8889c32979
https://mycontainer.blob.core.windows.net/myblobs/2a3bd653-ce14-4bd2-9243-6923e97224c6

Their contents should be the associated data field, i.e. foo, bar, and baz, respectively.

"Data Factory v2 - Generate a json file per row" has an answer that will work, but it involves querying the database once to get all the ids, then N more times to get the data from each row. It seems like it should be possible to query just once for both fields and use one for the filename and one for the contents, but I haven't been able to figure out how.

asked Mar 05 '19 by Gregory Seidman

1 Answer

This is pretty simple to do natively in ADF (a rough sketch of the pipeline JSON follows the steps below):

  1. Use a Lookup Activity to get the row results of your Database Table
  2. Use a For Each to Iterate over each row from the Lookup
  3. Within the For Each have a Copy Data activity that copies from DB (as a passthrough really) to Blob Storage
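To make the wiring concrete, here is a rough sketch of what that looks like in the pipeline JSON. The names are assumptions of mine (LookupRows, ForEachRow, CopyRowToBlob, a SQL dataset AzureSqlTableDs, and a table dbo.MyTable with id and data columns), not the exact definitions from my screenshots, and the Copy activity is left as a stub because its source and sink are covered under step 4 below:

    {
      "name": "BlobPerRowPipeline",
      "properties": {
        "activities": [
          {
            "name": "LookupRows",
            "type": "Lookup",
            "typeProperties": {
              "source": {
                "type": "AzureSqlSource",
                "sqlReaderQuery": "SELECT id, data FROM dbo.MyTable"
              },
              "dataset": { "referenceName": "AzureSqlTableDs", "type": "DatasetReference" },
              "firstRowOnly": false
            }
          },
          {
            "name": "ForEachRow",
            "type": "ForEach",
            "dependsOn": [ { "activity": "LookupRows", "dependencyConditions": [ "Succeeded" ] } ],
            "typeProperties": {
              "items": { "value": "@activity('LookupRows').output.value", "type": "Expression" },
              "activities": [ { "name": "CopyRowToBlob", "type": "Copy" } ]
            }
          }
        ]
      }
    }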

Here are screenshots with comments to illustrate the exact steps and proof that it works:

1st, here is my mock table, similar to your example: it has a pretend FileID field and a Data field.

[Screenshot: Mock Table]

2nd, I do a lookup on that table; see the details and output below.

[Screenshot: Lookup Activity Example]
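With "First row only" unchecked, the Lookup returns every matching row as an array under output.value, which is what the For Each will iterate. Using the column names and rows from the question, the Lookup output looks roughly like this:

    {
      "count": 3,
      "value": [
        { "id": "38b2f551-5f13-40ce-8512-c108a05ecd44", "data": "foo" },
        { "id": "4db5b25b-1194-44e9-a7b2-bc8889c32979", "data": "bar" },
        { "id": "2a3bd653-ce14-4bd2-9243-6923e97224c6", "data": "baz" }
      ]
    }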

3rd, we put in the For Each activity so that we can perform an action, or group of actions, per item in the array returned by the Lookup.

[Screenshot: For Each Example]
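The only setting that really matters on the For Each is the Items expression, which points at the Lookup's result array; inside the loop each row is then reachable through item(). Assuming the Lookup activity is named LookupRows and the question's column names, the expressions involved are:

    Items:          @activity('LookupRows').output.value
    File name:      @item().id
    Blob contents:  @item().data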

4th, we go into the For Each and create a Copy Data activity with a Source of Azure SQL DB and a Sink of Azure Blob.

Source details: [Screenshot: Copy Data Source]
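Since the source screenshot is gone, here is one way to set it up (an assumption on my part, not necessarily the exact query in the screenshot): give the Copy source a dynamic query that simply echoes back the value the Lookup already returned, so the table is not re-read for each row:

    "source": {
      "type": "AzureSqlSource",
      "sqlReaderQuery": {
        "value": "SELECT '@{item().data}' AS data",
        "type": "Expression"
      }
    }

The expression is interpolated before the query runs, so data containing single quotes would need escaping; alternatively the query can select by key (WHERE id = '@{item().id}'), which re-reads the table once per row but avoids the quoting issue.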

Sink details: [Screenshot: Copy Data Sink]
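For the sink, the trick is to parameterize the blob dataset's file name and pass @item().id into it from the Copy activity. A sketch under assumed names (a DelimitedText dataset BlobPerRowDs on a blob linked service AzureBlobStorageLs, written without a header so the single data column becomes the blob's contents):

    {
      "name": "BlobPerRowDs",
      "properties": {
        "type": "DelimitedText",
        "linkedServiceName": { "referenceName": "AzureBlobStorageLs", "type": "LinkedServiceReference" },
        "parameters": { "fileName": { "type": "string" } },
        "typeProperties": {
          "location": {
            "type": "AzureBlobStorageLocation",
            "container": "myblobs",
            "fileName": { "value": "@dataset().fileName", "type": "Expression" }
          },
          "firstRowAsHeader": false
        }
      }
    }

In the Copy activity, the sink dataset reference then passes the parameter as "fileName": { "value": "@item().id", "type": "Expression" }, which produces blobs named exactly like the GUIDs in the question. Depending on the content, you may also need to relax the quote/escape settings on the DelimitedText format so the data is written verbatim.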

And here is what happens in my Blob Storage when I run this pipeline:

[Screenshot: Proof it works]

Let me know if you need anything else or have any questions, hope this helps! Remember to like, love, and follow :)

answered Sep 30 '22 by Trent Tamura