Azure Table Storage transaction limitations

I'm running performance tests against ATS and it's behaving a bit weirdly when using multiple virtual machines against the same table / storage account.

The entire pipeline is non-blocking (await/async) and uses TPL for concurrent and parallel execution.

First of all, it's very strange that with this setup I'm only getting about 1,200 insertions per second. This is running on an L VM, which is 4 cores + 800 Mbps.

I'm inserting 100,000 rows with a unique PK and unique RK per row, which should give the best possible distribution.

Even more telling is the following deterministic behavior.

When I run 1 VM I get about 1,200 insertions per second. When I run 3 VMs I get about 730 insertions per second on each.
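Roughly, the pipeline looks like this (a minimal sketch with illustrative names, not the exact code under test; it assumes a storage client with a Task-based ExecuteAsync, e.g. 2.1+; on 2.0 you would wrap BeginExecute/EndExecute with Task.Factory.FromAsync):

```csharp
// Illustrative sketch only: async single-entity inserts, unique PK/RK per row.
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Table;

public class PerfEntity : TableEntity
{
    public PerfEntity() { }
    public PerfEntity(string pk, string rk) : base(pk, rk) { }
    public string Payload { get; set; }
}

public static class InsertTest
{
    public static async Task RunAsync(CloudTable table, int count, int concurrency)
    {
        var gate = new SemaphoreSlim(concurrency);   // bound in-flight requests
        var tasks = Enumerable.Range(0, count).Select(async _ =>
        {
            await gate.WaitAsync();
            try
            {
                // Unique PK and RK per row, as described above.
                var entity = new PerfEntity(Guid.NewGuid().ToString("N"),
                                            Guid.NewGuid().ToString("N"))
                             { Payload = new string('x', 1024) };
                await table.ExecuteAsync(TableOperation.Insert(entity));
            }
            finally { gate.Release(); }
        }).ToList();
        await Task.WhenAll(tasks);
    }
}
```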

It's quite humorous to read the blog post where they specify their targets: https://azure.microsoft.com/en-gb/blog/windows-azures-flat-network-storage-and-2012-scalability-targets/

Single Table Partition: a table partition is all of the entities in a table with the same partition key value, and usually tables have many partitions. The throughput target for a single table partition is:

Up to 2,000 entities per second

Note, this is for a single partition, and not a single table. Therefore, a table with good partitioning can process up to 20,000 entities per second, which is the overall account target described above.

What shall I do to be able to utilize the 20k entities per second, and how would it be possible to execute more than 1.2k per VM?

--

Update:

I've now also tried using 3 storage accounts, one for each individual node, and I'm still getting the same performance / throttling behavior, which I can't find a logical reason for.

--

Update 2:

I've optimized the code further and I'm now able to execute about 1,550 insertions per second.

--

Update 3:

I've now also tried in US West. The performance is worse there, about 33% lower.

--

Update 4:

I tried executing the code from an XL machine, which has 8 cores instead of 4 and double the amount of memory and bandwidth, and got only a 2% increase in performance, so clearly this problem is not on my side.

asked Jan 21 '13 by ptomasroos

1 Answer

A few comments:

  1. You mention that you are using unique PK/RK to get ultimate distribution, but you have to keep in mind that PK balancing is not immediate. When you first create a table, the entire table is served by one partition server. So even if you are doing inserts across several different PKs, they will still all go to one partition server and be bottlenecked by the scalability target for a single partition. The partition master will only start splitting your partitions among multiple partition servers after it has identified hot partition servers. In your <2-minute test you will not see the benefit of multiple partition servers or PKs. The throughput in the article is targeted towards a well-distributed PK scheme with frequently accessed data, causing the data to be divided amongst multiple partition servers.

  2. The size of your VM is not the issue, as you are not blocked on CPU, memory, or bandwidth. You can achieve full storage performance from a small VM size.

  3. Check out http://research.microsoft.com/en-us/downloads/5c8189b9-53aa-4d6a-a086-013d927e15a7/default.aspx. I just did a quick test using that tool from a WebRole VM in the same datacenter as my storage account and, from a single instance of the tool on a single VM, I achieved ~2800 items per second upload and ~7300 items per second download. This is using 1024-byte entities, 10 threads, and a batch size of 100. I don't know how efficient this tool is or whether it disables Nagle's algorithm, as I was unable to get great results (~1000/second) using a batch size of 1, but at least with the 100 batch size it shows that you can achieve a high items/second rate (see the batching sketch after this list). This was done in US West.

  4. Are you using Storage client library 1.7 (Microsoft.WindowsAzure.StorageClient.dll) or 2.0 (Microsoft.WindowsAzure.Storage.dll)? The 2.0 library has some performance improvements and should yield better results.
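
To illustrate point 3, here is a minimal sketch of batched inserts with Nagle's algorithm disabled (it reuses the PerfEntity type from the question's sketch, and ExecuteBatchAsync assumes a client with Task-based APIs). Keep in mind that an entity group transaction is limited to 100 operations and every entity in the batch must share the same PartitionKey, which is why a unique-PK-per-row scheme cannot use batching:

```csharp
// Hedged sketch: batch size 100, Nagle disabled. All entities in one batch
// must share the same PartitionKey (entity group transaction rules).
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Table;

public static class BatchInsertTest
{
    public static async Task InsertPartitionAsync(CloudTable table, string pk, int count)
    {
        // Set these once at startup, before the first request goes out.
        ServicePointManager.UseNagleAlgorithm = false;     // don't delay small writes
        ServicePointManager.Expect100Continue = false;     // skip 100-continue round trip
        ServicePointManager.DefaultConnectionLimit = 100;  // default of 2 caps throughput

        for (int offset = 0; offset < count; offset += 100)
        {
            var batch = new TableBatchOperation();
            int end = Math.Min(offset + 100, count);
            for (int i = offset; i < end; i++)
            {
                // 1024-byte payload to mirror the tool's test entities.
                batch.Insert(new PerfEntity(pk, i.ToString("D8"))
                             { Payload = new string('x', 1024) });
            }
            await table.ExecuteBatchAsync(batch);
        }
    }
}
```

Running several of these partitions in parallel (distinct pk values) is what lets the items/second figure climb toward the per-account target rather than the per-partition one.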

answered Oct 02 '22 by kwill