
Schema design in InfluxDB

My use case for InfluxDB is storing and trending process data coming from different PLCs. I visualize this data using Grafana. In a first pilot, I followed the InfluxDB schema design guidelines, using a generic measurement name and distinguishing the different value sources by means of tags.

For example, when I have 2 pumps in the 'acid' pump group and 2 pumps in the 'caustic' pump group, of which I record the pressure:

- pump_pressure {pump: pump_1, group: acid} 
- pump_pressure {pump: pump_2, group: acid} 
- pump_pressure {pump: pump_1, group: caustic} 
- pump_pressure {pump: pump_2, group: caustic} 

In my use case, the end user wants to be able to make their own trends, using Grafana for example. While this way of recording the data conforms to the InfluxDB schema design guidelines (I think), it is very confusing for non-technical people who are not used to working with and thinking in SQL-like languages.

Therefore, I'm tempted to store the data in the way they are used to, which is also the usual way of working in similar products (historians):

- ACID_pump_1_pressure
- ACID_pump_2_pressure
- CAUSTIC_pump_1_pressure
- CAUSTIC_pump_2_pressure

This would make it much easier for the end user to make trends, as one measurement = one data source, and they don't have to worry about WHERE and GROUP BY clauses.
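To make the difference concrete, here is a minimal sketch of the InfluxQL query an end user would effectively need under each schema. The field name `value` and the helper names are assumptions for illustration; neither is specified above.

```python
def quote_ident(name):
    """Double-quote an identifier (measurement, tag key, field key) for InfluxQL."""
    return '"' + name + '"'


def quote_str(value):
    """Single-quote a string literal (tag value) for InfluxQL."""
    return "'" + value + "'"


def tagged_query(measurement, tags, field="value"):
    """Query for the tag-based schema: one WHERE condition per tag."""
    conditions = " AND ".join(
        quote_ident(key) + " = " + quote_str(val)
        for key, val in sorted(tags.items())
    )
    return "SELECT {} FROM {} WHERE {}".format(
        quote_ident(field), quote_ident(measurement), conditions
    )


def flat_query(measurement, field="value"):
    """Query for the one-measurement-per-source schema: no WHERE clause needed."""
    return "SELECT {} FROM {}".format(quote_ident(field), quote_ident(measurement))


if __name__ == "__main__":
    print(tagged_query("pump_pressure", {"group": "acid", "pump": "pump_1"}))
    print(flat_query("ACID_pump_1_pressure"))
```

Both queries address the same single data source; the tagged form simply requires the user to know which tag keys and values to filter on.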

Can anyone give me some pointers on what the impact of the latter would be on InfluxDB performance and storage? Will the data take more space this way? Please note that the latter method can lead to a few thousand measurements, but their cardinality would all be 1.

coussej asked May 12 '16 09:05



2 Answers

There is no reason you can't do that if it fits your use case better. The guidelines that you started with are there because they unlock the full power of InfluxDB's tagging capability.

There will be no performance or storage implications. Internally, InfluxDB creates a new series based on each unique measurement "key", where the key is the combination of measurement name and tag key/value pairs.

That is, each of these is a separate series:

pump_pressure,pump=pump_1,group=acid
pump_pressure,pump=pump_2,group=acid
pump_pressure,pump=pump_1,group=caustic
pump_pressure,pump=pump_2,group=caustic

Likewise, each of these is a separate series:

ACID_pump_1_pressure
ACID_pump_2_pressure
CAUSTIC_pump_1_pressure
CAUSTIC_pump_2_pressure
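As a rough illustration of the point above, the following sketch builds a series key from a measurement name plus its tag key/value pairs and counts the distinct keys for both schemas. It is a simplified model, not InfluxDB's internal code: the `series_key` helper is hypothetical, and sorting tags by key is an assumption made here only so that equal tag sets always produce the same key.

```python
def series_key(measurement, tags=None):
    """Combine a measurement name and its tag set into one key string.

    Simplified model: tags are sorted by key so the same tag set
    always yields the same key regardless of insertion order.
    """
    tags = tags or {}
    parts = [measurement] + [
        "{}={}".format(key, value) for key, value in sorted(tags.items())
    ]
    return ",".join(parts)


groups = ("acid", "caustic")
pumps = ("pump_1", "pump_2")

# Schema 1: one generic measurement, sources separated by tags.
tagged_keys = {
    series_key("pump_pressure", {"pump": pump, "group": group})
    for group in groups
    for pump in pumps
}

# Schema 2: one measurement per data source, no tags.
flat_keys = {
    series_key("{}_{}_pressure".format(group.upper(), pump))
    for group in groups
    for pump in pumps
}

if __name__ == "__main__":
    print(sorted(tagged_keys))
    print(sorted(flat_keys))
    print(len(tagged_keys), len(flat_keys))
```

In this model both schemas end up with the same number of distinct series, one per pump, which is the reasoning behind the statement above.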

EDIT, source: I work at InfluxData

EDIT 2: That said, I also fully agree with @srikanta. I would recommend keeping the tags and finding another way for the users to interact with the database (or educating them).

Cameron Sparr answered Nov 02 '22 22:11


You can indeed go with this approach. However, it is not scalable. What if the number of pumps increases? The approach still works, with the number of measurements equal to the number of time series, but it becomes a pain to manage.

If the problem is that non-technical users should not have to deal with SQL-like queries, then a different solution to that problem should be considered, rather than altering the "schema" of the database.
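One possible shape for such a solution, sketched under assumptions: keep the tagged schema in the database and translate the historian-style names the users know into a measurement plus tags in a thin layer in front of it. The naming pattern, the `historian_to_tagged` helper, and the `pump_<quantity>` measurement convention below are all hypothetical, inferred from the example names in the question.

```python
import re

# Assumed naming convention: <GROUP>_pump_<n>_<quantity>, e.g. ACID_pump_1_pressure.
NAME_PATTERN = re.compile(
    r"^(?P<group>[A-Za-z]+)_(?P<pump>pump_\d+)_(?P<quantity>[a-z]+)$"
)


def historian_to_tagged(name):
    """Translate a historian-style name into (measurement, tags).

    Raises ValueError if the name does not follow the assumed convention.
    """
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError("unrecognized historian name: {!r}".format(name))
    measurement = "pump_" + match.group("quantity")
    tags = {"group": match.group("group").lower(), "pump": match.group("pump")}
    return measurement, tags


if __name__ == "__main__":
    for name in ("ACID_pump_1_pressure", "CAUSTIC_pump_2_pressure"):
        print(name, "->", historian_to_tagged(name))
```

With a mapping like this, users can keep picking familiar names from a list while the stored data retains its tags.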

Some more insights: https://blog.zhaw.ch/icclab/influxdb-design-guidelines-to-avoid-performance-issues/

Srikanta answered Nov 03 '22 00:11