
Difference between a time series database and a streaming analytics engine like Spark Streaming

Can a time series database do everything that a streaming analytics system (like Spark Streaming, Flink, or Kinesis Analytics) can?

Does one subsume the other? I am not asking which one is better. I just want to understand which use cases each one supports.

user855 asked Mar 17 '26 00:03


1 Answer

Time series databases focus on storing and retrieving timestamped entries more efficiently than general-purpose relational databases. They have recently become a hot topic again, given the industry's interest in high-performance event processing. Today, most of them rely on specialized indexing techniques on top of NoSQL stores, e.g. OpenTSDB (HBase), InfluxDB (BoltDB), and so on.
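To make the "storage and retrieval by time" idea concrete, here is a minimal sketch of a time-ordered index with a range query. It is not how OpenTSDB or InfluxDB are implemented; `TinySeries` and its methods are hypothetical names, and everything lives in memory purely for illustration.

```python
import bisect


class TinySeries:
    """Hypothetical in-memory series kept sorted by timestamp."""

    def __init__(self):
        self._ts = []      # sorted timestamps
        self._values = []  # values aligned with self._ts

    def append(self, ts, value):
        # Insert at the sorted position so out-of-order writes still work.
        i = bisect.bisect_right(self._ts, ts)
        self._ts.insert(i, ts)
        self._values.insert(i, value)

    def range(self, start, end):
        """Return (ts, value) pairs with start <= ts < end."""
        lo = bisect.bisect_left(self._ts, start)
        hi = bisect.bisect_left(self._ts, end)
        return list(zip(self._ts[lo:hi], self._values[lo:hi]))


series = TinySeries()
for ts, v in [(10, 1.0), (30, 3.0), (20, 2.0), (40, 4.0)]:
    series.append(ts, v)

print(series.range(20, 40))
```

The point is that the data is stored first and queried afterwards: the range lookup is a binary search over already-persisted points, which is the access pattern a time series database is optimized for.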

On the other hand, distributed stream processing frameworks like Spark Streaming build on research into Data Stream Management Systems and provide more flexible ways of analysing events. They are usually applied to other kinds of analysis, such as machine learning over streams, sketches, and windowing, along with other techniques that are not the focus of time series databases.
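As a contrast, windowing in a stream processor computes results while events flow through, without storing them first. The following is a plain-Python sketch of a tumbling-window average, not the Spark Streaming or Flink API; `tumbling_window_avg` is a hypothetical helper, and it assumes events arrive in timestamp order (real engines handle late data with watermarks).

```python
def tumbling_window_avg(events, window_size):
    """Yield (window_start, average) each time a fixed-size window closes.

    events: iterable of (timestamp, value), assumed to be in timestamp order.
    """
    current = None  # start of the window currently being filled
    total = 0.0
    count = 0
    for ts, value in events:
        start = (ts // window_size) * window_size
        if current is not None and start != current:
            # A new window began, so emit the finished one and reset state.
            yield current, total / count
            total, count = 0.0, 0
        current = start
        total += value
        count += 1
    if count:
        # Flush the last, still-open window when the input ends.
        yield current, total / count


events = [(1, 10.0), (4, 20.0), (11, 30.0), (12, 50.0), (25, 5.0)]
windows = list(tumbling_window_avg(events, window_size=10))
print(windows)
```

Only a running sum and count are kept per window, so memory does not grow with the stream. That incremental, bounded-state style is what distinguishes stream processing from querying stored history.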

Both originated in research from the 2000s on Time Series Databases and Data Stream Management Systems, so many of the features and architectural ideas of one have been applied to the other and vice versa. One example: the seminal stream processing paper "Continuous Queries over Data Streams" (S. Babu and J. Widom, 2001) cites time series databases as related work.

otaviocarvalho answered Mar 19 '26 18:03


