How to download 100 million rows from Azure Table Storage FAST

I have been tasked with downloading around 100 million rows of data from Azure Table Storage. The important thing here is speed.

Our current process downloads 10,000 rows at a time from Azure Table Storage and processes them into a local instance of SQL Server. While processing the rows, it deletes them from the Azure table 100 rows at a time. The process is threaded, with 8 threads each downloading 10,000 rows at a time.

The only problem is that, according to our calculations, it will take around 40 days to download and process the roughly 100 million rows we have stored. Does anyone know a faster way to accomplish this task?
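To put the 40-day estimate in perspective, the arithmetic below converts it into a sustained row rate. It uses only the figures stated above (100 million rows, 40 days, 8 threads); the variable and function names are made up for illustration.

```python
# Back-of-the-envelope throughput implied by the figures in the question.
TOTAL_ROWS = 100_000_000   # rows stored in the table (approximate)
ESTIMATED_DAYS = 40        # the estimate from our calculations
THREADS = 8                # download threads currently in use

SECONDS_PER_DAY = 24 * 60 * 60


def rows_per_second(total_rows, days):
    """Average rate needed to move total_rows in the given number of days."""
    return total_rows / (days * SECONDS_PER_DAY)


def days_needed(total_rows, rate):
    """Days required to move total_rows at a sustained rate in rows/second."""
    return total_rows / rate / SECONDS_PER_DAY


overall_rate = rows_per_second(TOTAL_ROWS, ESTIMATED_DAYS)
per_thread_rate = overall_rate / THREADS

print(f"overall: {overall_rate:.1f} rows/s")        # overall: 28.9 rows/s
print(f"per thread: {per_thread_rate:.1f} rows/s")  # per thread: 3.6 rows/s
```

In other words, the whole pipeline (download, SQL Server insert, and delete) is averaging about 29 rows per second, or under 4 rows per second per thread.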

A side question: during the download process, Azure will send back XML that simply does not contain any data. It doesn't send back an error; it sends this:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<feed xml:base="azure-url/" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns="http://www.w3.org/2005/Atom">
  <title type="text">CommandLogTable</title>
  <id>azure-url/CommandLogTable</id>
  <updated>2010-07-12T19:50:55Z</updated>
  <link rel="self" title="CommandLogTable" href="CommandLogTable" />
</feed>
0

Has anyone else run into this problem and found a fix for it?
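For reference, a minimal sketch of how such an empty response can be detected: it counts the Atom `entry` elements in the feed, and the response shown above has none. The function name `count_entries` is hypothetical, and the sketch assumes the trailing `0` has been stripped, since the XML parser would reject it as content after the closing `</feed>` tag.

```python
import xml.etree.ElementTree as ET

# Namespace of the Atom elements in the feed shown above.
ATOM_NS = "{http://www.w3.org/2005/Atom}"

# The empty response from the question, without the trailing "0" line.
EMPTY_FEED = """<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<feed xml:base="azure-url/" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" xmlns="http://www.w3.org/2005/Atom">
  <title type="text">CommandLogTable</title>
  <id>azure-url/CommandLogTable</id>
  <updated>2010-07-12T19:50:55Z</updated>
  <link rel="self" title="CommandLogTable" href="CommandLogTable" />
</feed>"""


def count_entries(feed_xml):
    """Return the number of <entry> elements directly under <feed>."""
    # Parse from bytes so the encoding declaration in the prolog is honored.
    root = ET.fromstring(feed_xml.encode("utf-8"))
    return len(root.findall(f"{ATOM_NS}entry"))


if count_entries(EMPTY_FEED) == 0:
    print("feed contains no rows")
```

One thing that may be worth checking when this happens (an assumption on my part, not something confirmed here) is whether the response still carries the `x-ms-continuation-NextPartitionKey` and `x-ms-continuation-NextRowKey` headers, since an empty page does not necessarily mean the query is finished.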

asked Jul 12 '10 19:07

jWoose

1 Answer

Aside from suggestions about bandwidth limits, you could easily be running into storage account limits, as each table partition is limited to roughly 500 transactions per second.
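If the per-partition limit is the bottleneck, one option is to spread the download threads across partitions so that no single partition absorbs all of the requests. The following is only a sketch under assumptions: it presumes the table has several partitions whose keys are known, and `download_partition` is a hypothetical placeholder for whatever code actually queries one partition, not a real SDK call.

```python
from concurrent.futures import ThreadPoolExecutor


def download_partition(partition_key):
    """Hypothetical stand-in for the real per-partition query.

    A real implementation would page through every row whose PartitionKey
    equals partition_key; this one returns fake rows so the sketch runs.
    """
    return [f"{partition_key}-row{i}" for i in range(3)]


def download_all(partition_keys, fetch=download_partition, max_workers=8):
    """Download each partition on its own worker thread.

    Each worker talks to a different partition, so the per-partition
    transaction limit applies to each worker separately.
    """
    keys = list(partition_keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with keys.
        results = pool.map(fetch, keys)
        return dict(zip(keys, results))


rows_by_partition = download_all(["p1", "p2", "p3"])
print({key: len(rows) for key, rows in rows_by_partition.items()})
```

Whether this helps depends on how the data is actually partitioned; if everything sits in one partition, adding threads only increases contention against the same limit.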

Further: Nagle's algorithm, a TCP optimization that is on by default, could actually slow things down for small reads (such as your 1K data reads). Here's a blog post about disabling Nagling, which could potentially speed up your reads considerably, especially if you're running directly in an Azure service without Internet latency in the way.
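In .NET, Nagle's algorithm can be turned off by setting `ServicePointManager.UseNagleAlgorithm = false` before the first request is made. Since the question doesn't say which client stack is in use, here is a language-neutral illustration of the same idea at the socket level: the `TCP_NODELAY` option disables Nagle's algorithm on a TCP socket. The helper name is hypothetical, and an HTTP client library would normally need its own hook to apply this option.

```python
import socket


def open_socket_without_nagle():
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 sends small segments immediately instead of
    # buffering them while waiting for outstanding ACKs.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = open_socket_without_nagle()
# A nonzero value confirms the option took effect.
nodelay = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)
sock.close()
print("Nagle disabled:", nodelay != 0)
```

The gain, if any, should be measured rather than assumed; it matters most when many small requests are sent over low-latency links.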

answered Sep 27 '22 22:09

David Makogon

Tags:

azure

azure-storage

azure-table-storage
