Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a powerful database system for time series data? [closed]

In multiple projects we have to store, aggregate, evaluate simple measurement values. One row typcially consists of a time stamp, a value and some attributes to the value. In some applications we would like to store 1000 values per second and more. These values must not only be inserted but also deleted at the same rate, since the lifetime of a value is restricted to a year or so (in different aggregation steps, we do not store 1000/s for the whole year).

Until now, we have developped different solutions. One based on Firebird, one on Oracle and one on some self-made storage mechanism. But none of these are very satisfying solutions.

Both RDBMS solutions cannot handle the desired data flow. Besides that, the applications that deliver the values (e.g. device drivers) cannot be easily attached to databases, the insert statements are cumbersome. And finally, while having an SQL interface to the data is strongly desired, typical evaluations are hard to formulate in SQL and slow in the execution. E.g. find the maximum value with time stamp per 15 minutes for all measurements during the last month.

The self-made solution can handle the insertion rate and has a client-friendly API to do it, but it has nothing like a query language and cannot be used by other applications via some standard interface e.g. for reporting.

The best solution in my dreams would be a database system that:

  • has an API for very fast insertion
  • is able to remove/truncate the values in the same speed
  • provides a standard SQL interface with specific support for typical time series data

Do you know some database that comes near those requirements or would you approach the problem in a different way?

like image 579
Kit Fisto Avatar asked Jan 11 '12 08:01

Kit Fisto


People also ask

What database is used for time series data?

Relational database management systems (RDBS), which are often considered general-purpose database systems, can be used to store and retrieve time series data.

How do you store time series data efficiently?

Storing time series data. Time series data is best stored in a time series database (TSDB) built specifically for handling metrics and events that are time-stamped. This is because time series data is often ingested in massive volumes that require a purpose-built database designed to handle that scale.

Which NoSQL database is best suitable for time series data management?

InfluxDB. InfluxDB is one of the most popular time series databases among DevOps, which is written in Go.


2 Answers

Most other answers seem to mention SQL based databases. NoSQL based databases are far superior at this kind of thing.

Some Open source time-series databases:

  • https://prometheus.io - Monitoring system and time series database
  • http://influxdb.com/ - time series database with no external dependencies (only basic server is open-source)
  • http://square.github.io/cube/ - Written ontop of MongoDB
  • http://opentsdb.net/ - Written on top of Apache HBase
  • https://github.com/kairosdb/kairosdb - A rewrite of OpenTSDB that also enables using Cassandra instead of Hadoop
  • http://www.gocircuit.org/vena.html - A tutorial on writing a substitute of OpenTSDB using Go-circuits
  • https://github.com/rackerlabs/blueflood - Based on Cassandra
  • https://github.com/druid-io/druid - Column oriented & hadoop based distributed data store

Cloud-based:

  • https://www.tempoiq.com
like image 169
Joakim Avatar answered Sep 21 '22 10:09

Joakim


influxdb :: An open-source distributed time series database with no external dependencies.

  • http://influxdb.org/
like image 39
A.N. Avatar answered Sep 22 '22 10:09

A.N.