I have a system that checks the status of a large number of entities on a schedule, every minute. For each entity there is a JSON record whose fields indicate the statuses of its different attributes. The system dumps these JSON files on a network share.
Each run of the per-minute schedule generates a JSON file with 20,000-odd entities like the following, each having tens of attributes:
[
  {
    "entityid": 12345,
    "attribute1": "queued",
    "attribute2": "pending"
  },
  {
    "entityid": 34563,
    "attribute1": "running",
    "attribute2": "successful"
  }
]
I need to be able to track how the attribute statuses of each entity change over time, for instance, to answer questions like: when did the status of entity x become "pending"? What is the best way to store this data and generate the stats?
You should store your data in a database. If your data always has the same structure, you could use a "classical" relational database like PostgreSQL or MySQL. If your data has an irregular shape, look at NoSQL databases like MongoDB. If you need the data back as JSON, you can easily export it from the database.
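A minimal sketch of the relational approach, using Python's built-in sqlite3 module so it runs as-is (for production you would swap in a PostgreSQL or MySQL driver). The database path, input file name, and the status_changes table are my own assumptions for illustration. The key idea is to store a change log rather than every snapshot: a row is written only when an attribute's status differs from the last recorded value, so the table grows with the number of changes, not with 20k entities x tens of attributes per minute.

import json
import sqlite3
from datetime import datetime, timezone

# Assumed names for illustration; adjust to your share layout.
DB_PATH = "status_history.db"
RUN_FILE = "run.json"  # one scheduled dump of ~20k entities

conn = sqlite3.connect(DB_PATH)
conn.execute("""
    CREATE TABLE IF NOT EXISTS status_changes (
        entityid   INTEGER NOT NULL,
        attribute  TEXT    NOT NULL,
        status     TEXT    NOT NULL,
        changed_at TEXT    NOT NULL  -- ISO-8601 timestamp of the run
    )
""")
conn.execute("""
    CREATE INDEX IF NOT EXISTS idx_entity_attr
    ON status_changes (entityid, attribute, changed_at)
""")

run_time = datetime.now(timezone.utc).isoformat()

with open(RUN_FILE) as f:
    entities = json.load(f)

for entity in entities:
    entityid = entity["entityid"]
    for attribute, status in entity.items():
        if attribute == "entityid":
            continue
        # Most recently recorded status for this entity/attribute.
        row = conn.execute(
            """SELECT status FROM status_changes
               WHERE entityid = ? AND attribute = ?
               ORDER BY changed_at DESC LIMIT 1""",
            (entityid, attribute),
        ).fetchone()
        # Record a row only when the status actually changed.
        if row is None or row[0] != status:
            conn.execute(
                "INSERT INTO status_changes VALUES (?, ?, ?, ?)",
                (entityid, attribute, status, run_time),
            )

conn.commit()
conn.close()

For 20k entities per run, a real implementation would batch these lookups (e.g. load the latest status per entity/attribute into memory once per run, or use a temporary staging table and a single INSERT ... SELECT), but the per-row version above keeps the logic obvious.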
Here is an article that discusses generating JSON from a database: https://hashrocket.com/blog/posts/faster-json-generation-with-postgresql
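Once the change log exists, your example question, "when did the status of entity x become 'pending'", reduces to a single query. A sketch against the same assumed table:

import sqlite3

conn = sqlite3.connect("status_history.db")
# First time entity 12345's attribute2 was recorded as "pending".
row = conn.execute(
    """SELECT MIN(changed_at) FROM status_changes
       WHERE entityid = ? AND attribute = ? AND status = ?""",
    (12345, "attribute2", "pending"),
).fetchone()
print(row[0])  # None if that status was never reached
conn.close()

The same table supports other stats cheaply, e.g. GROUP BY status for a distribution of current states, or counting rows per day for change frequency.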