I am trying to identify possible methods for storing 100 channels of 25 Hz floating-point data. This works out to 78,840,000,000 data points per year.
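(That is 100 channels × 25 samples/s × 86,400 s/day × 365 days = 78,840,000,000 samples; at 4 bytes per single-precision float, roughly 315 GB of raw values per year before indexes or compression.)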
Ideally, all of this data would be efficiently available to websites and to tools such as SQL Server Reporting Services. We are aware that relational databases are poor at handling time series at this scale, but we have yet to identify a convincing time-series-specific database.
The key issues are compression for efficient storage while still allowing easy and efficient querying, reporting, and data mining.
How would you handle this data?
Are there features or table designs in SQL Server that could handle such a quantity of time-series data?
If not, are there any third-party extensions for SQL Server to efficiently handle mammoth time series?
If not, are there time-series databases that specialise in handling such data yet provide natural access through SQL, .NET, and SQL Server Reporting Services?
Thanks!
I'd partition the table by date, splitting the data into chunks of 216,000,000 rows each (one day's worth at 2,500 samples per second).
Provided that you don't need whole-year statistics, this is easily served by indexes: a query like "give me the average for a given hour" will be a matter of seconds.
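For concreteness, here is a minimal sketch of daily partitioning in T-SQL (SQL Server 2005+). The object names, boundary dates, and single-filegroup layout are illustrative assumptions, not a production design:

    -- Partition function: one partition per day (boundaries here are
    -- examples; in practice you'd add one per day and slide the window
    -- with ALTER PARTITION FUNCTION ... SPLIT/MERGE).
    CREATE PARTITION FUNCTION pfDaily (datetime)
    AS RANGE RIGHT FOR VALUES ('2010-01-01', '2010-01-02', '2010-01-03');

    CREATE PARTITION SCHEME psDaily
    AS PARTITION pfDaily ALL TO ([PRIMARY]);

    CREATE TABLE dbo.Samples (
        SampleTime datetime NOT NULL,   -- 25 Hz timestamps
        ChannelId  smallint NOT NULL,   -- 1..100
        Value      real     NOT NULL    -- 4-byte float
    ) ON psDaily (SampleTime);

    -- Cluster on (SampleTime, ChannelId), aligned with the partition
    -- scheme, so the hourly query below is a single range scan inside
    -- one day's partition.
    CREATE CLUSTERED INDEX ixSamples
        ON dbo.Samples (SampleTime, ChannelId)
        ON psDaily (SampleTime);

    -- "Give me the average for the given hour" for one channel:
    SELECT AVG(Value)
    FROM dbo.Samples
    WHERE ChannelId  = 42
      AND SampleTime >= '2010-06-01 13:00'
      AND SampleTime <  '2010-06-01 14:00';

Partition elimination means the hourly query touches only one day's partition, and the clustered index narrows that to the 90,000 rows (25 Hz × 3,600 s) for that channel and hour.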
I suppose you need random access to the data series. The idea I've already used for a rainfall data table is to subdivide the entire dataset into smaller parts, creating one record for every few minutes, or even one per minute. You can then pull this (still large) array from the database and access the needed part directly, because there is a direct correlation between time offset and byte offset.
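A rough T-SQL sketch of that layout, assuming one row per channel per minute with 4-byte floats packed into a varbinary (all names are made up for illustration):

    -- 25 Hz * 60 s * 4 bytes = 6,000 bytes per channel per minute.
    CREATE TABLE dbo.ChannelMinutes (
        ChannelId   smallint        NOT NULL,
        MinuteStart datetime        NOT NULL,
        Samples     varbinary(6000) NOT NULL,  -- 1,500 packed floats
        PRIMARY KEY (ChannelId, MinuteStart)
    );

    -- Sample i (0..1499) within the minute starts at byte offset i*4;
    -- SUBSTRING is 1-based and works on varbinary.
    DECLARE @i int;
    SET @i = 1042;
    SELECT SUBSTRING(Samples, @i * 4 + 1, 4) AS SampleBytes
    FROM dbo.ChannelMinutes
    WHERE ChannelId   = 7
      AND MinuteStart = '2010-06-01 13:05';

Note that SQL Server won't cast varbinary directly to real, so the 4-byte slice would be decoded on the client side (e.g. BitConverter.ToSingle in .NET).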