Time Series Databases - Metrics vs. tags

I'm new to time series databases (TSDBs), and I have a lot of temperature sensors to store in my database at one point per second. Is it better to use one unique metric per sensor, or a single metric (temperature, for example) with distinct tags for each sensor?

I searched the Internet for the best practice, but I didn't find a good answer.

Thank you! :-)

Edit: I will have 8 types of measurements (temperature, setpoint, energy, power, ...) from 2500 sources.

B 7 asked Jul 20 '15 11:07


2 Answers

If you are storing your data in InfluxDB, I would recommend storing all the metrics in a single measurement and using tags to differentiate the sources, rather than creating a measurement per source. The reason is that you can trivially merge or decompose the metrics using tags within a measurement, but the newest InfluxDB cannot merge or join across measurements.
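As a rough illustration of the single-measurement layout, here is a minimal Python sketch that formats one point in InfluxDB line protocol (`measurement,tag=value field=value timestamp`). The measurement name `sensors`, the tag key `sensor_id`, and the helper names are assumptions made for this example, not something InfluxDB prescribes. The sketch also assumes float field values and a measurement name without commas or spaces.

```python
def escape_tag(text):
    """Backslash-escape commas, equals signs, and spaces in keys and tag values."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as: measurement,tag=value field=value timestamp."""
    # Tags identify the source; sorting them by key keeps the output stable.
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    # Fields carry the measured values; this sketch writes them all as floats.
    field_part = ",".join(
        f"{escape_tag(k)}={float(v)}" for k, v in fields.items()
    )
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


# Hypothetical point: one sensor reporting two of the eight value types.
line = to_line_protocol(
    "sensors",
    {"sensor_id": "sensor-001"},
    {"temperature": 42.2, "setpoint": 40.0},
    1437390420000000000,  # example timestamp in nanoseconds
)
print(line)
```

With this layout, every sensor writes to the same measurement, so a query can filter or group by the `sensor_id` tag instead of needing a join across 2500 measurements.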

Ultimately the decision rests with both your choice of TSDB and the queries you care most about running.

beckettsean answered Sep 25 '22 08:09


For comparison, in Axibase Time-Series Database (ATSD) you can store temperature as a metric and the sensor id as the entity name. The ATSD schema has a notion of an entity, which is the name of the system for which data is being collected. The advantages are more compact storage and the ability to define tags for the entities themselves, for example sensor location or sensor type. This way you can filter and group results not just by sensor id but also by sensor tags.

For example, in this blog article, 0601911 is the entity id, which is an EPA station id. The station collects several environmental metrics and is also described by multiple tags in the database: http://axibase.com/environmental-monitoring-using-big-data/

The bottom line is that you don't have to set up a second database, typically a relational one, just to store extended information about sensors, servers, etc. for advanced reporting.

UPDATE 1: Sample network command:

series e:sensor-001 d:2015-08-03T00:00:00Z m:temperature=42.2 m:humidity=72 m:precipitation=44.3

Tags that describe sensor-001, such as location and type, are stored separately, which minimizes the storage footprint and speeds up queries. If you're collecting energy or power metrics, you often have to attach attributes such as Status to a series, because the data may not arrive clean or verified. You can use series tags for this purpose:

series e:sensor-001 d:2015-08-03T00:00:00Z m:temperature=42.2 ... t:status=Provisional
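
For anyone generating these commands from code, here is a minimal Python sketch that assembles a `series` command string in the shape shown above. It covers only the `e:`, `d:`, `m:`, and `t:` fields that appear in the examples; the function name `series_command` is hypothetical, and quoting or escaping of special characters is not handled.

```python
def series_command(entity, date, metrics, tags=None):
    """Build a series command: series e:<entity> d:<date> m:<name>=<value> ... t:<name>=<value>."""
    parts = ["series", f"e:{entity}", f"d:{date}"]
    # One m: field per metric collected for this entity at this timestamp.
    parts += [f"m:{name}={value}" for name, value in metrics.items()]
    # Optional t: fields are series tags, such as a data quality status.
    parts += [f"t:{name}={value}" for name, value in (tags or {}).items()]
    return " ".join(parts)


readings = {"temperature": 42.2, "humidity": 72, "precipitation": 44.3}

plain = series_command("sensor-001", "2015-08-03T00:00:00Z", readings)
tagged = series_command(
    "sensor-001",
    "2015-08-03T00:00:00Z",
    {"temperature": 42.2},
    tags={"status": "Provisional"},
)
print(plain)
print(tagged)
```

The first call reproduces the sample network command, and the second adds a series tag to a single-metric command.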
Sergei Rodionov answered Sep 24 '22 08:09