I wonder if someone could take a minute out of their day to give their two cents on my problem.
I would like some suggestions on the best data structure for representing, on disk, a large set of time series data. The main priority is speed of insertion, followed, in decreasing order of priority, by speed of retrieval, size on disk, size in memory, and speed of removal.
I have seen that B+ trees are often used in databases because of their fast search times, but how do they fare for fast insertion? Is a linked list really the way to go?
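One option worth weighing against both of those, if the points arrive in time stamp order, is a plain append-only file of fixed-size records. The sketch below is only an illustration under that assumption: the record layout (two 8-byte floats) and the names `append_point` and `read_range` are made up for the example. Insertion is a single append, and retrieval uses a binary search over record positions.

```python
import os
import struct

# Hypothetical record layout: 8-byte float timestamp + 8-byte float value.
RECORD = struct.Struct("<dd")


def append_point(path, timestamp, value):
    """Append one record to the end of the file (no seeking, no rebalancing)."""
    with open(path, "ab") as f:
        f.write(RECORD.pack(timestamp, value))


def read_range(path, start, end):
    """Return all (timestamp, value) pairs with start <= timestamp <= end.

    Assumes records were appended in non-decreasing timestamp order, so a
    binary search over record indices can locate the first match.
    """
    count = os.path.getsize(path) // RECORD.size
    results = []
    with open(path, "rb") as f:
        # Binary search for the first record whose timestamp is >= start.
        lo, hi = 0, count
        while lo < hi:
            mid = (lo + hi) // 2
            f.seek(mid * RECORD.size)
            ts, _ = RECORD.unpack(f.read(RECORD.size))
            if ts < start:
                lo = mid + 1
            else:
                hi = mid
        # Scan forward sequentially until the end of the requested range.
        f.seek(lo * RECORD.size)
        for _ in range(lo, count):
            ts, value = RECORD.unpack(f.read(RECORD.size))
            if ts > end:
                break
            results.append((ts, value))
    return results
```

The trade-off is that removal and out-of-order insertion are expensive with this layout, which matches the priority order in the question but would not suit every workload.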
The TimeSeries data type is a constructor data type that groups together a collection of ROW data types in time stamp order. A ROW data type consists of a group of named columns. The rows in a TimeSeries data type, called elements, each represent one or more data values for a specific time stamp.
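To make that description concrete, here is a rough in-memory sketch of the same idea. It is not the actual TimeSeries type from any database product; the class and its `insert` and `at` methods are hypothetical names chosen for the example. Each element is a dict of named columns, and elements are kept in time stamp order.

```python
import bisect
from dataclasses import dataclass, field


@dataclass
class TimeSeries:
    """Rows (dicts of named columns) kept in time stamp order."""
    timestamps: list = field(default_factory=list)  # sorted time stamps
    rows: list = field(default_factory=list)        # rows[i] belongs to timestamps[i]

    def insert(self, timestamp, **columns):
        # Find the position that keeps the time stamps sorted.
        i = bisect.bisect_right(self.timestamps, timestamp)
        self.timestamps.insert(i, timestamp)
        self.rows.insert(i, columns)

    def at(self, timestamp):
        # Binary search for an element with exactly this time stamp.
        i = bisect.bisect_left(self.timestamps, timestamp)
        if i < len(self.timestamps) and self.timestamps[i] == timestamp:
            return self.rows[i]
        return None
```

For example, `ts.insert(1, temp=19.0, humidity=40)` stores one element with two named columns, and `ts.at(1)` returns it.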
A line graph is the simplest way to represent time series data visually. It is intuitive, easy to create, and helps the viewer get a quick sense of how something has changed over time. A line graph uses points connected by lines (also called trend lines) to show how a dependent variable changes as an independent variable, here time, changes.
Arrays. An array is a linear data structure that holds an ordered collection of values. It is among the most efficient structures for storing a sequence of objects and accessing them by index.
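As a small illustration of the array approach, the sketch below keeps time stamps and values in two parallel typed arrays from the standard library. The function names `append` and `window` are invented for the example, and it assumes points are appended in time stamp order, so a range lookup can use binary search.

```python
from array import array
from bisect import bisect_left, bisect_right

timestamps = array("q")  # signed 64-bit integers, e.g. epoch seconds
values = array("d")      # 64-bit floats


def append(ts, value):
    """Add a point at the end; assumes ts is not older than the last one."""
    timestamps.append(ts)
    values.append(value)


def window(start, end):
    """Return (timestamp, value) pairs with start <= timestamp <= end."""
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)
    return list(zip(timestamps[lo:hi], values[lo:hi]))
```

Typed arrays store the raw numbers contiguously, which keeps memory use low compared with a list of Python objects.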
You might want to look into HDF5 (Hierarchical Data Format). It is well suited to time series data. Implementation-wise, it uses B-trees internally for indexing.