We have applications and services that use a lot of configuration, most of which is currently hard-coded in the Java code and spread over many classes. Obviously, this needs to change: we want the configuration centralized in one place, and retrieved and exposed by a single service (say, ConfigurationService) that also caches it for its clients for better performance. We also need dynamic reloading of the configuration so that long-running applications can avoid restarts. I would like some comments on the kind of storage I should use for this purpose:
The data need not be structured. It could be a simple key-value pair, or it could be a multi-key, single-value pair. Here are some random examples of configuration -
Basically, the set and type of parameters that form a key are not fixed, which hints that this configuration is not really structured. The volume of the entire configuration won't be huge. There will be very few writes compared to reads.
Database (RDBMS / NoSQL) - The advantage of using a database table could be the security and backups it provides. Since this doesn't look like relational data, I would consider a NoSQL solution. I've not really used any of them personally, so please tell me which one suits this kind of data better. As there could be a lot of different keys, we should be able to pick exact keys (some kind of indexing). Database usage will introduce latency, but efficient caching can be built to overcome this (as there won't be too many writes to the configuration). The data is easier to query.
Files (XML or other flat files) - We can keep it simple by using files. Caching can be used with files as well. As long as the entire configuration fits in memory (RAM), loading all of it is an option too (selective cache invalidation would have to be implemented). Files provide versioning, but permissions/security would have to be looked into. XML files especially can become messy if they grow large. The data may not be easy to query if we're using files.
Which should be a better storage solution assuming that dynamic reloading and cache invalidation are implemented separately? What other factors should be considered here?
If files are to be used to store such configuration, what are the better file formats for such use-cases?
Note: I asked a similar question on SO, but probably didn't frame the question as clearly as I should have, so created a new one instead of making heavy edits.
Please, please do yourself a favor and evaluate whether or not Archaius or Curator are appropriate for your needs. Archaius is probably more appropriate for application and container configuration where Curator is probably better for machine level configuration.
The examples you provide suggest you might want some sort of rules engine. To show what I mean, I interpret your examples as having the following semantics:
if (true) {
Client_Id = "ABC";
}
if (User_Type == "Admin" && Region == "Mumbai" && User_rating == "9") {
Commission = "10%";
}
if (User_Id == "123") {
WhitelistedRegions = ["Mumbai", "Goa"];
}
If my interpretation is wrong, then perhaps you could edit your question to clarify your intended meaning. On the other hand, if my interpretation is correct, then I am not aware of any particular configuration syntax that is tailor-made for your requirements. Instead, I suspect you will have to shoehorn the semantics of what you want into the constraints of whatever configuration syntax you decide to use.
The way I might try to shoehorn (my interpretation of) your examples into the syntax of Config4* (disclaimer: I am its main developer) is as follows:
uid-rule {
# unconditional
client_Id = "ABC";
}
uid-rule {
condition {
User_Type = "Admin";
Region = "Mumbai";
User_rating = "9";
}
Commission = "10%%";
}
uid-rule {
condition { User_Id = "123"; }
WhitelistedRegions = ["Mumbai", "Goa"];
}
I recommend you read Chapter 2 of the Config4* Getting Started manual (HTML, PDF) so you can understand the syntax used in the above example.
My initial attempt at shoehorning your examples into XML syntax is:
<rules>
<rule>
<property name="client_Id" value="ABC"/>
</rule>
<rule>
<condition name="User_Type" value="Admin"/>
<condition name="Region" value="Mumbai"/>
<condition name="User_rating" value="9"/>
<property name="Commission" value="10%"/>
</rule>
<rule>
<condition name="User_Id" value="123"/>
<property name="WhitelistedRegions" value="Mumbai, Goa"/>
</rule>
</rules>
Note that neither a Config4* parser nor an XML parser will give you the semantics you want out-of-the-box. Instead, you should write a class called, say, RulesEngine. Such a class would: (1) parse a configuration file to obtain the rules and cache the parsed representation in memory; (2) provide a simple API for querying that in-memory set of rules; and (3) provide a reloadConfiguration() method that re-parses the configuration file. Your application would somehow trigger the invocation of the reloadConfiguration() method (for example, once every few minutes).
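To make those three responsibilities concrete, here is a minimal sketch of such a class for the XML syntax shown earlier. Everything beyond the names RulesEngine and reloadConfiguration() is an assumption made for illustration: the lookup() method, the Supplier<String> that provides the XML text, and the "first matching rule wins" semantics are hypothetical choices, not something prescribed by any library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical sketch of a rules engine; the first matching rule wins. */
class RulesEngine {
    /** One parsed <rule>: conditions that must all hold, and the properties it sets. */
    private static final class Rule {
        final Map<String, String> conditions = new LinkedHashMap<>();
        final Map<String, String> properties = new LinkedHashMap<>();
    }

    private final Supplier<String> xmlSource; // assumed source of the XML text (file, URL, ...)
    private volatile List<Rule> rules;        // replaced as a whole on each reload

    RulesEngine(Supplier<String> xmlSource) {
        this.xmlSource = xmlSource;
        reloadConfiguration();
    }

    /** (3) Re-parses the configuration; if parsing fails, the old rules stay in place. */
    final void reloadConfiguration() {
        this.rules = parse(xmlSource.get());
    }

    /** (2) Returns the property value from the first rule whose conditions all match. */
    Optional<String> lookup(String property, Map<String, String> context) {
        for (Rule rule : rules) {
            if (!rule.properties.containsKey(property)) {
                continue;
            }
            boolean matches = rule.conditions.entrySet().stream()
                    .allMatch(c -> c.getValue().equals(context.get(c.getKey())));
            if (matches) {
                return Optional.of(rule.properties.get(property));
            }
        }
        return Optional.empty();
    }

    /** (1) Parses the <rules> document into an in-memory list of rules. */
    private static List<Rule> parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<Rule> parsed = new ArrayList<>();
            NodeList ruleNodes = root.getElementsByTagName("rule");
            for (int i = 0; i < ruleNodes.getLength(); i++) {
                Rule rule = new Rule();
                NodeList children = ruleNodes.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child.getNodeType() != Node.ELEMENT_NODE) {
                        continue; // skip whitespace and comments
                    }
                    Element e = (Element) child;
                    if (e.getTagName().equals("condition")) {
                        rule.conditions.put(e.getAttribute("name"), e.getAttribute("value"));
                    } else if (e.getTagName().equals("property")) {
                        rule.properties.put(e.getAttribute("name"), e.getAttribute("value"));
                    }
                }
                parsed.add(rule);
            }
            return parsed;
        } catch (Exception ex) {
            throw new IllegalStateException("Could not parse rules configuration", ex);
        }
    }
}
```

A caller would ask something like engine.lookup("Commission", context), where context maps User_Type, Region and so on to their current values. For the periodic trigger, one option is ScheduledExecutorService.scheduleAtFixedRate from java.util.concurrent, calling reloadConfiguration() every few minutes.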
If you use XML for your configuration syntax, then I suggest that you achieve your centralization goal by storing the XML file on a web server. The XML parser can retrieve the file from there. If you use Config4* syntax, then the Config4* integration with curl makes it possible to do the same thing.
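For the XML case, the JDK's own parser can fetch the document directly, because DocumentBuilder.parse(String uri) accepts a URI. A minimal sketch, in which the class name RemoteRulesLoader and any URL you pass to it are hypothetical:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical helper that loads a rules document from a central location. */
class RemoteRulesLoader {
    /** Parses the XML document found at the given URI (for example http:, https: or file:). */
    static Document load(String uri) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // A configuration file should not need a DTD; rejecting DOCTYPE declarations
        // guards against external-entity tricks in a file fetched over the network.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return factory.newDocumentBuilder().parse(uri);
    }
}
```

Calling RemoteRulesLoader.load(...) with the address of the file on your web server returns a DOM tree that your rules class can then walk. Timeouts, authentication and retry behaviour are left out of this sketch and would need thought in a real deployment.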