While trying to understand the different concepts in InfluxDB, I came across this documentation, which compares InfluxDB terms with those of an SQL database.
An InfluxDB measurement is similar to an SQL database table.
InfluxDB tags are like indexed columns in an SQL database.
InfluxDB fields are like unindexed columns in an SQL database.
InfluxDB points are similar to SQL rows.
But there are a couple of other terms I came across that I could not clearly understand, and I am wondering whether there is an SQL equivalent for them.
Series
Bucket
From what I understand from the documentation
series is the collection of data that share a retention policy, measurement, and tag set.
Does this mean a series is a subset of the data in a database table? Or is it like a database view?
I could not find any documentation explaining buckets. I guess this is a new concept in the 2.0 release.
Can someone please clarify these two concepts?
A bucket is a named location where time series data is stored. All buckets have a retention period, a duration of time that each data point persists. InfluxDB drops all points with timestamps older than the bucket's retention period. A bucket belongs to an organization.
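To make the retention rule concrete, here is a minimal Python sketch of the behaviour described above. It is only a model of the rule, not how InfluxDB's storage engine implements it; the function name `drop_expired`, the point layout, and the 30-day retention period are assumptions made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def drop_expired(points, retention, now):
    """Return only the points whose timestamp is within the retention period.

    Hypothetical helper: models "drop all points older than the bucket's
    retention period" on a plain list of dicts.
    """
    cutoff = now - retention
    return [p for p in points if p["time"] >= cutoff]


# Illustrative data: one point 40 days old, one point 5 days old.
now = datetime(2021, 11, 13, tzinfo=timezone.utc)
points = [
    {"time": now - timedelta(days=40), "value": 1},
    {"time": now - timedelta(days=5), "value": 2},
]

# Assumed retention period of 30 days for this example bucket.
kept = drop_expired(points, timedelta(days=30), now)
print([p["value"] for p in kept])  # prints [2]
```

With a 30-day retention period, the 40-day-old point is gone and only the recent one remains.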
In InfluxDB, a series is a collection of points that share a measurement, tag set, and field key.
A bucket is a named location where data is stored that has a retention policy. It's similar to an InfluxDB v1.x “database,” but is a combination of both a database and a retention policy. When using multiple retention policies, each retention policy is treated as its own bucket.
A bucket is most commonly a type of data buffer or a type of document in which data is divided into regions.
I have summarized my understanding below:
For example, a SQL table workdone:

Email | Status | time | Completed |
---|---|---|---|
[email protected] | start | 1636775801000000000 | 76 |
[email protected] | finish | 1636775868000000000 | 120 |
[email protected] | start | 1636775801000000000 | 0 |
[email protected] | finish | 1636775868000000000 | 20 |
[email protected] | start | 1636775801000000000 | 54 |
[email protected] | finish | 1636775868000000000 | 56 |

The columns Email and Status are indexed.
Hence:

Measurement: workdone
Tags: Email, Status
Field: Completed

Series (one per distinct combination of measurement, tag values, and field key):

Measurement: workdone; Tags: Email: [email protected], Status: start; Field: Completed
Measurement: workdone; Tags: Email: [email protected], Status: finish; Field: Completed
Measurement: workdone; Tags: Email: [email protected], Status: start; Field: Completed
Measurement: workdone; Tags: Email: [email protected], Status: finish; Field: Completed
Measurement: workdone; Tags: Email: [email protected], Status: start; Field: Completed
Measurement: workdone; Tags: Email: [email protected], Status: finish; Field: Completed
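The grouping above can be sketched in Python by treating a series as everything that shares the same (measurement, tag set, field key) triple. This is an illustration of the definition, not InfluxDB's internal representation. The three `example.com` addresses are hypothetical placeholders, chosen on the assumption that the table holds three distinct emails, and `series_key` is a made-up helper name.

```python
from collections import defaultdict

# Rows of the workdone table: (Email, Status, time, Completed).
# The email addresses are hypothetical placeholders.
rows = [
    ("a@example.com", "start", 1636775801000000000, 76),
    ("a@example.com", "finish", 1636775868000000000, 120),
    ("b@example.com", "start", 1636775801000000000, 0),
    ("b@example.com", "finish", 1636775868000000000, 20),
    ("c@example.com", "start", 1636775801000000000, 54),
    ("c@example.com", "finish", 1636775868000000000, 56),
]


def series_key(measurement, tags, field_key):
    """Build a hashable key from measurement, tag set, and field key."""
    # Sort the tag pairs so the key does not depend on dict ordering.
    return (measurement, tuple(sorted(tags.items())), field_key)


# Group every point (timestamp, field value) under its series key.
series = defaultdict(list)
for email, status, timestamp, completed in rows:
    key = series_key("workdone", {"Email": email, "Status": status}, "Completed")
    series[key].append((timestamp, completed))

print(len(series))  # prints 6
```

Three email values times two status values gives six series, each holding a single point here. A new point with an existing email and status would join an existing series rather than create a new one.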
Splitting a logical series across multiple buckets may not improve performance, but it may complicate Flux queries, because each query then needs to include multiple buckets.
According to the InfluxDB glossary:
Bucket
A bucket is a named location where time-series data is stored in InfluxDB 2.0. In InfluxDB 1.8+, each combination of a database and a retention policy (database/retention-policy) represents a bucket. Use the InfluxDB 2.0 API compatibility endpoints included with InfluxDB 1.8+ to interact with buckets.
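As a small illustration of the database/retention-policy naming described in this glossary entry, the following Python sketch builds the bucket name for a 1.8+ database and retention policy pair. The helper name `bucket_name` and the example values `mydb` and `autogen` are assumptions for illustration only.

```python
def bucket_name(database, retention_policy):
    """Join a 1.x database and retention policy into a bucket name.

    Hypothetical helper following the database/retention-policy form
    quoted from the glossary.
    """
    return f"{database}/{retention_policy}"


# Example values are placeholders, not taken from the question.
print(bucket_name("mydb", "autogen"))  # prints mydb/autogen
```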
Series
A logical grouping of data defined by shared measurement, tag set, and field key.