I have XML files stored in blob storage, and I am trying to figure out the most efficient way to update them (and/or add elements to them). In a WebRole, I came up with this:
using (MemoryStream ms = new MemoryStream())
{
    var blob = container.GetBlobReference("file.xml");
    blob.DownloadToStream(ms);
    ms.Seek(0, SeekOrigin.Begin); // rewind before parsing
    XDocument xDoc = XDocument.Load(ms);
    // Do some updates/inserts using LINQ to XML.

    blob.Delete(); // Details about this later on.
    using (MemoryStream msNew = new MemoryStream())
    {
        xDoc.Save(msNew);
        msNew.Seek(0, SeekOrigin.Begin);
        blob.UploadFromStream(msNew);
    }
}
I am looking at this mainly from an efficiency standpoint, storage transactions in particular. Some things to mention:
My XML files are around 150-200 KB.
I am aware that XDocument loads the whole file into memory, and that working with streams (XmlReader and XmlWriter) could avoid this. But I assume that would require working with BlobStream, which I suspect is less efficient transaction-wise (a sketch of that approach follows this list).
About blob.Delete(): without it, the uploaded XML in blob storage seems to be missing some closing tags at the end. I assumed this was caused by a collision with the old data. I could be completely wrong here, but deleting first solved it (at the cost of one more transaction).
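For reference, a minimal sketch of that stream-based approach with the v1 storage client, assuming CloudBlob.OpenRead/OpenWrite (the target blob name "file-updated.xml" is made up, and whether BlobStream's block uploads cost more transactions is exactly the open question):

var blob = container.GetBlobReference("file.xml");

using (BlobStream input = blob.OpenRead())
using (XmlReader reader = XmlReader.Create(input))
using (BlobStream output = container.GetBlobReference("file-updated.xml").OpenWrite())
using (XmlWriter writer = XmlWriter.Create(output))
{
    // Copies the document node by node without buffering it all in memory;
    // updates would be applied by inspecting nodes here instead of
    // copying the reader straight through.
    writer.WriteNode(reader, true);
}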
Is the code I provided good practice, or is there a more efficient way, considering the parameters I mentioned?
You cannot update a blob in place. You must read the old blob's data into a buffer where you can modify it, then write the modified data back as a new blob.
You can synchronize local storage with Azure Blob storage by using the AzCopy v10 command-line utility. You can synchronize the contents of a local file system with a blob container, and you can also synchronize containers and virtual directories with one another. Synchronization is one-way.
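For example, a one-way sync from a local folder to a container looks like this (the account, container, and local path are placeholders):

azcopy sync "C:\local\xml-files" "https://myaccount.blob.core.windows.net/mycontainer" --recursive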
The change feed enables you to build efficient and scalable solutions that process change events that occur in your Blob Storage account at a low cost.
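A minimal sketch of reading the change feed, assuming the newer Azure.Storage.Blobs.ChangeFeed package (a different SDK generation than the code in this question, with connectionString as a placeholder):

using System;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.ChangeFeed;

var serviceClient = new BlobServiceClient(connectionString);
BlobChangeFeedClient changeFeedClient = serviceClient.GetChangeFeedClient();

// Iterate every change event recorded for the storage account.
await foreach (BlobChangeFeedEvent changeFeedEvent in changeFeedClient.GetChangesAsync())
{
    Console.WriteLine($"{changeFeedEvent.EventTime}: {changeFeedEvent.EventType} on {changeFeedEvent.Subject}");
}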
I believe the problem with the stream-based method is that the storage client doesn't know how long the stream is before it starts sending the data. This probably causes the content-length not to be updated, giving the appearance of missing data at the end of the file.
Working with the blob's content as text will help. You can download the blob contents as text and then upload as text. Doing this, you should be able to both avoid the delete (saving you a third of the transactions) and have simpler code.
var blob = container.GetBlobReference("file.xml");
var xml = blob.DownloadText(); // transaction 1
var xDoc = XDocument.Parse(xml);
// Do some updates/inserts using LINQ to XML.
blob.UploadText(xDoc.ToString()); // transaction 2
Additionally, if you can recreate the file without downloading it in the first place (which is sometimes possible), then you can just upload it and overwrite the old one, using one storage transaction.
var blob = container.GetBlobReference("file.xml");
var xDoc = new XDocument(/* generate file */);
blob.UploadText(xDoc.ToString()); // transaction 1