 

Optimized/best way for reading/writing a shared resource

One of my needs is to manage a shared resource (more like a log, with both read and write operations) among different processes (and thus multiple threads) within an application. The data should also persist across system restarts, so it has to live in a physical file or database.

The shared resource holds key/value information, so the possible operations on it are adding a new key/value pair and updating or deleting an existing one.

Hence I am thinking about using an XML file to store the info physically, and the sample content will look like:

<Root>
   <Key1>Value</Key1>
   <Key2>Value</Key2>
   <Key3>Value</Key3>
</Root>

The interface for the read and write operations will look like:

    public interface IDataHandler
    {
       IDictionary<string,string> GetData();
       void SetData(string key,string value);
    }

I can assume the data will not grow beyond 500 MB, hence the XML decision; if it grows past that I will move it to a DB. Also, write operations will outnumber reads.

A few queries/design considerations related to the above scenario:

Is it fine to handle 500 MB of data in an XML file?

Assuming the file is XML, how do I take care of performance?

  • I am thinking about caching the data as a Dictionary (using the MemoryCache class in .NET), which should give good read performance. Is it OK to cache 500 MB of data in memory, or is there a better option?
  • If I use the above cache mechanism, what should happen during a write operation?
  • Should I serialize the whole dictionary back to XML on every write, or is there a way to update only the portion of the XML file whose data is being modified or added? Or is there some other way to handle this scenario?
  • Should I improve write performance further by putting write operations into a queue and having a background thread drain the queue and perform the actual file write, so that the caller who writes the data is not blocked by file I/O?
  • To handle the multi-process scenario, I am planning to use a Mutex with a global name. Is there a better way to do it?
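The queued-write idea above would look roughly like this (a sketch only; `QueuedWriter` and the `persist` callback are names I made up, and the persistence step itself is left as a delegate):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Sketch: writers enqueue key/value pairs and return immediately;
// a single background task drains the queue, applies each change to an
// in-memory dictionary, and invokes the persist callback (e.g. rewrite the XML file).
public sealed class QueuedWriter : IDisposable
{
    private readonly BlockingCollection<KeyValuePair<string, string>> _queue =
        new BlockingCollection<KeyValuePair<string, string>>();
    private readonly Dictionary<string, string> _data = new Dictionary<string, string>();
    private readonly Action<IDictionary<string, string>> _persist;
    private readonly Task _consumer;

    public QueuedWriter(Action<IDictionary<string, string>> persist)
    {
        _persist = persist;
        _consumer = Task.Run(() =>
        {
            foreach (var pair in _queue.GetConsumingEnumerable())
            {
                _data[pair.Key] = pair.Value;  // add or update
                _persist(_data);               // only this thread touches the file
            }
        });
    }

    public void SetData(string key, string value) =>
        _queue.Add(new KeyValuePair<string, string>(key, value));

    public void Dispose()
    {
        _queue.CompleteAdding();  // drain remaining items, then stop
        _consumer.Wait();
    }
}
```

Only the background task performs I/O, so writers are never blocked on the file; the trade-off is that a crash can lose whatever was still queued.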

I am sure I am operating on a few assumptions and have tried to build from there; if some of those assumptions are wrong, that would change most of the design. Hence an entirely new solution is also welcome (keeping performance as the main criterion). Thanks in advance.

srsyogesh asked Aug 16 '14

3 Answers

As you said write operations outnumber reads, I assume the data will grow quickly, so my suggestion is to start by designing for a database. It does not have to be a full-featured database like MSSQL or MySQL; you can start with SQLite or SQL Server Compact. This makes your app future-proof for handling larger amounts of data.

Keeping heavily read data that rarely changes, like configuration, in RAM is the efficient way. My suggestion is to use a cache manager like MemoryCache or the Enterprise Library Caching Block instead of writing your own; this saves you a lot of time implementing thread-safe data access, and the nightmares that go with it :)
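For the MemoryCache route, a rough sketch (the cache key `ConfigCacheKey` is just an example name; `System.Runtime.Caching` is in-box on .NET Framework and a NuGet package on .NET Core):

```csharp
using System.Collections.Generic;
using System.Runtime.Caching;  // MemoryCache lives here

// Sketch: keep the whole dictionary under a single cache entry,
// so reads never touch the disk at all.
public static class ConfigCache
{
    private const string CacheKey = "ConfigCacheKey";

    public static IDictionary<string, string> GetData() =>
        MemoryCache.Default.Get(CacheKey) as IDictionary<string, string>
            ?? new Dictionary<string, string>();

    public static void SetData(string key, string value)
    {
        var data = GetData();
        data[key] = value;  // the indexer adds or updates
        MemoryCache.Default.Set(CacheKey, data, new CacheItemPolicy());
    }
}
```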

public interface IDataHandler
{
   IDictionary<string,string> GetData();
   void SetData(string key,string value);
}

public class MyDataHandler : IDataHandler
{
   public IDictionary<string,string> GetData()
   {
       return CacheManager.GetData("ConfigcacheKey") as IDictionary<string,string>;
   }

   public void SetData(string key,string value)
   {
       var data = GetData() ?? new Dictionary<string,string>();
       if(data.ContainsKey(key)) data[key] = value;
       else data.Add(key,value);

       CacheManager.Add("ConfigcacheKey", data);

       // HERE write an async method to save the key,value in database or XML file
   }
}

If you are going with XML then you do not need to convert the whole dictionary to XML every time. Load the XML document into an XmlDocument/XDocument object, use XPath to find the element whose value you need to update (or add a new element), and save the document.
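A sketch of that in-place update with XDocument, assuming the `<Root><Key1>Value</Key1></Root>` layout from the question (the save path is a placeholder):

```csharp
using System.Xml.Linq;
using System.Xml.XPath;  // XPathSelectElement extension method

public static class XmlStore
{
    // Update one key's element (or add it) instead of rebuilding the whole document.
    public static void SetData(XDocument doc, string key, string value)
    {
        var element = doc.Root.XPathSelectElement(key);  // e.g. "Key1"
        if (element != null)
            element.Value = value;                       // update in place
        else
            doc.Root.Add(new XElement(key, value));      // append a new entry
        // doc.Save("config.xml");  // placeholder path
    }
}
```

Note that using the key as the element name, as in the question's format, restricts keys to valid XML names; a `<Entry key="...">` layout would be more robust.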

From a performance point of view, unless you have some crazy logic or handle huge (I mean very huge, in the GBs) amounts of data, I recommend you finish your app quickly using the already available, battle-tested components like databases and cache managers, which abstract away the thread-safe operations.

cackharot answered Nov 17 '22


I see two possible approaches to this problem:

  • Use of a database. IMO this is the preferred approach, since this is exactly the thing that databases are designed for: concurrent read/write access by multiple applications.
  • Use a "service" application that will manage the resource and can be accessed (Pipes, Sockets, SharedMem, ...) by other applications.

Critical points to remember:

  1. A global Mutex doesn't work across multiple machines. (The XML file may lie on a network share; if you cannot rule that out as "unsupported", then you shouldn't use a Mutex.)
  2. A "lock file" can leak locks (e.g. if the process that created the lock file is killed, the file may remain on disk).
  3. XML is a very bad format if a file is repeatedly updated by multiple processes (e.g. if you need a load-update-write cycle for each access, performance will be very poor).
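For reference, single-machine cross-process locking with a named Mutex looks roughly like this (the mutex name is an arbitrary example; the caveats above about network shares still apply):

```csharp
using System;
using System.Threading;

public static class SharedFileLock
{
    // The "Global\" prefix makes the mutex visible across all sessions
    // on this machine only -- it does not protect files on a network share.
    public static void WithLock(Action criticalSection)
    {
        using (var mutex = new Mutex(false, @"Global\MyAppConfigFileMutex"))
        {
            bool acquired = false;
            try
            {
                try { acquired = mutex.WaitOne(TimeSpan.FromSeconds(10)); }
                catch (AbandonedMutexException) { acquired = true; }  // previous holder died; we now own it
                if (!acquired) throw new TimeoutException("Could not acquire the file lock.");
                criticalSection();  // read or write the shared file here
            }
            finally
            {
                if (acquired) mutex.ReleaseMutex();
            }
        }
    }
}
```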
Daniel answered Nov 17 '22


Base your solution on the design principles of this Stackoverflow answer:

How to effectively log asynchronously?

As you mention in one of your considerations, the above solution involves threading and queueing.

Also, instead of serializing the data to XML, you can probably get better performance using BinaryFormatter.
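BinaryFormatter is considered unsafe and is obsolete in modern .NET, so here is the same "binary instead of XML" idea sketched with BinaryWriter/BinaryReader instead (a hand-rolled format: a count followed by length-prefixed string pairs; all names are my own):

```csharp
using System.Collections.Generic;
using System.IO;

// Sketch of a simple binary format for the key/value store.
// Smaller and faster to parse than XML, at the cost of not being human-readable.
public static class BinaryStore
{
    public static void Save(IDictionary<string, string> data, Stream stream)
    {
        using (var writer = new BinaryWriter(stream, System.Text.Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(data.Count);          // entry count first
            foreach (var pair in data)
            {
                writer.Write(pair.Key);        // BinaryWriter length-prefixes strings
                writer.Write(pair.Value);
            }
        }
    }

    public static IDictionary<string, string> Load(Stream stream)
    {
        using (var reader = new BinaryReader(stream, System.Text.Encoding.UTF8, leaveOpen: true))
        {
            int count = reader.ReadInt32();
            var data = new Dictionary<string, string>(count);
            for (int i = 0; i < count; i++)
                data[reader.ReadString()] = reader.ReadString();
            return data;
        }
    }
}
```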

John Jesus answered Nov 17 '22