
Azure - Updating an existing xml file in BLOB storage

I have XML files stored in Blob storage, and I am trying to figure out the most efficient way to update them (and/or add elements to them). In a WebRole, I came up with this:

using (MemoryStream ms = new MemoryStream())
{
      var blob = container.GetBlobReference("file.xml");
      blob.DownloadToStream(ms);
      ms.Seek(0, SeekOrigin.Begin); // Rewind so XDocument.Load reads from the start.
      XDocument xDoc = XDocument.Load(ms);

      // Do some updates/inserts using LINQ to XML.

      blob.Delete(); // Details about this later on.

      using (MemoryStream msNew = new MemoryStream())
      {
           xDoc.Save(msNew);
           msNew.Seek(0, SeekOrigin.Begin);
           blob.UploadFromStream(msNew);
      }
}

I am judging efficiency by these parameters:

  1. BLOB Transactions.
  2. Bandwidth. (Not sure if it's counted, because the code runs in the data-center)
  3. Memory consumption on the instance.

Some things to mention:

  • My xml files are around 150-200 KB.

  • I am aware that XDocument loads the whole file into memory, and that working with streams (XmlWriter and XmlReader) could solve this. But I assume this would require working with BlobStream, which could be less efficient transaction-wise (I think).

  • About blob.Delete(): without it, the uploaded XML in blob storage seems to be missing some closing tags at the end. I assumed this was caused by a collision with the old data. I could be completely wrong here, but adding the delete solved it (at the cost of one more transaction).

Is the code I provided good practice, or is there a more efficient way, considering the parameters I mentioned?

Yaron Levi asked Oct 04 '11 20:10



1 Answer

I believe the problem with the stream based method is that the storage client doesn't know how long the stream is before it starts to send the data. This is probably causing the content-length to not be updated, giving the appearance of missing data at the end of the file.

Working with the blob's content as text will help: you can download the blob contents as text and then upload them as text. Doing this, you should be able to both avoid the delete (saving one of the three transactions) and have simpler code.

var blob = container.GetBlobReference("file.xml");
var xml = blob.DownloadText(); // transaction 1
var xDoc= XDocument.Parse(xml);

// Do some updates/inserts using LINQ to XML.

blob.UploadText(xDoc.ToString()); //  transaction 2
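
One thing to watch with the text-based round trip: XDocument.ToString() does not emit the XML declaration, so a file that started with <?xml ... ?> would lose it on upload. The sketch below shows only the parse/update/serialize step, with no storage calls. The AddItem helper and the items/item element names are hypothetical, made up for illustration; the string it returns is what you would pass to UploadText.

```csharp
using System;
using System.Xml.Linq;

public static class XmlRoundTrip
{
    // Hypothetical helper: parse XML text, append one <item>, return the new text.
    public static string AddItem(string xml, string name)
    {
        XDocument doc = XDocument.Parse(xml);
        doc.Root.Add(new XElement("item", new XAttribute("name", name)));

        // XDocument.ToString() omits the XML declaration, so prepend it when present.
        string body = doc.ToString(SaveOptions.DisableFormatting);
        return doc.Declaration == null ? body : doc.Declaration.ToString() + body;
    }

    public static void Main()
    {
        string original =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?><items><item name=\"a\" /></items>";
        Console.WriteLine(AddItem(original, "b"));
    }
}
```

If the consumers of the file do not need the declaration, plain xDoc.ToString() as in the snippet above is enough.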

Additionally, if you can recreate the file without downloading it in the first place (we can do this sometimes), then you can just upload it and overwrite the old one using one storage transaction.

var blob = container.GetBlobReference("file.xml");
var xDoc= new XDocument(/* generate file */);

blob.UploadText(xDoc.ToString()); // transaction 1
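
As a rough illustration of what "generate file" could look like, here is a minimal sketch that builds the document from in-memory data with LINQ to XML. The Build helper and the items/item structure are assumptions for the example, not anything from your files; the resulting text would go to UploadText as above.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlBuilder
{
    // Hypothetical helper: build a whole document from data already in memory,
    // so nothing has to be downloaded first.
    public static XDocument Build(string[] names)
    {
        return new XDocument(
            new XElement("items",
                names.Select(n => new XElement("item", new XAttribute("name", n)))));
    }

    public static void Main()
    {
        XDocument doc = Build(new[] { "a", "b" });
        Console.WriteLine(doc.ToString());
    }
}
```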
The Big Sadowski answered Oct 21 '22 22:10