
Caching time series aggregates of diffs

Is it possible to precalculate (cache) aggregates (min/max/avg) of values which are a difference of two signals?

I have several channels (e.g. 50), with one or more measurements each second, and I can easily store precalculated 1-minute or 15-minute aggregates for faster display.

But one of the requirements is to show a chart of relative values. E.g. if I have channels C1, C2 and C3, the user would like to see averages of C1, and averages of (C2 - C3) (or the 15-minute minimum/maximum) on a separate chart.

For example, let's say I have these 2 channels (and 48 more):

t(min)    0    +1   +2   +3   +4   +5   +6   +7   +8   +9   +10
C1       0.0  0.1  0.2 -0.1  0.0  0.1  0.3  0.5  0.7  0.9  0.2
C2       0.1  0.4  0.2  0.1 -0.1  0.5  0.6  0.1  0.2  0.3  0.0

I can precalculate and store 5-minute aggregates:

t(min)    0 to +4    +5 to +10
C1_min     -0.1         0.1
C1_max      0.2         0.9
C2_min     -0.1         0.0
C2_max      0.4         0.6

And easily get 10-min or 15-min aggregates from this.
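
This roll-up works because min and max compose: the minimum over a longer window is the minimum of the sub-window minima. A minimal Python sketch using the C1 figures from the table above (the names `buckets_5min` and `merge` are illustrative, not from any library):

```python
# 5-minute (min, max) aggregates for C1, copied from the table above.
buckets_5min = [(-0.1, 0.2), (0.1, 0.9)]

def merge(buckets):
    """Combine adjacent (min, max) aggregates into one coarser aggregate."""
    lows = [low for low, _ in buckets]
    highs = [high for _, high in buckets]
    return (min(lows), max(highs))

c1_10min = merge(buckets_5min)
print(c1_10min)  # (-0.1, 0.9)
```

An average rolls up the same way only if each bucket also stores its sample count (or its sum), since the two buckets here hold 5 and 6 samples respectively.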

But if the user wants to see min(C2-C1) or max(C2-C1) 5-minute aggregates, for any combination of these 50 channels, it seems that I cannot reuse this information.

In other words, it seems to me that it's impossible to precalculate this, apart from storing each possible combination of these tuples, because min(C2-C3) doesn't equal min(C2)-min(C3).
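
The mismatch is easy to check on the first 5-minute window of the sample data above. This is only an illustrative Python sketch; the variable names are made up for the example:

```python
# Raw samples for t = 0 .. +4, copied from the table above.
c1 = [0.0, 0.1, 0.2, -0.1, 0.0]
c2 = [0.1, 0.4, 0.2, 0.1, -0.1]

# True aggregate: take the difference per sample, then the minimum.
true_min = min(b - a for a, b in zip(c1, c2))

# Naive attempt: combine the two cached per-channel minima.
naive_min = min(c2) - min(c1)

print(true_min, naive_min)  # the two values differ
```

The per-channel minima occur at different times (t = +3 for C1, t = +4 for C2), so subtracting them pairs up samples that never coexisted.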

Am I missing some idea which might help me calculate these values faster?

Lou asked Jan 08 '17 19:01


1 Answer

To get the exact aggregate min(C2-C3), you need all the raw samples of C2 and C3 in the window; the per-channel aggregates are not enough.

However, if your goal is to minimize the data you have to read for this calculation, I suggest doing it the following way (this solution requires handling very large numbers, depending on the number of channels):

If you know that every channel stays within a fixed range (let's say 0 <= Ci < 10; if values can be negative, as in your sample data, add a constant offset first) and you only need p decimal places, then you can combine all the channels' data into 1 channel. Let's name it C.

To calculate C, give each channel its own slot of w = p + 1 decimal digits so that the values cannot overlap:

C = round(C1 * 10^p) * 10^(0*w) + round(C2 * 10^p) * 10^(1*w) + .. + round(Cn * 10^p) * 10^((n-1)*w)

You would end up with a channel C that has all the channels' values embedded.

Then, to calculate the difference between 2 channels at some point, all you have to do is "extract" those 2 channels' values from C on the fly:

C1 = (floor(C / 10^(0*w)) mod 10^w) / 10^p
C2 = (floor(C / 10^(1*w)) mod 10^w) / 10^p
...
Cn = (floor(C / 10^((n-1)*w)) mod 10^w) / 10^p

Where p is the decimal precision of the extracted channel value and w = p + 1 is the slot width.

In this case the calculation of the diff between two channels x and y using the pre-calculated C would be (the min is taken over every sample of C in the window):

min(Cy-Cx) = min(((floor(C / 10^((y-1)*w)) mod 10^w) - (floor(C / 10^((x-1)*w)) mod 10^w)) / 10^p)

And then you can aggregate those values over intervals of time. Hope it helps.
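
A minimal Python sketch of the packing idea, assuming non-negative values below 10, `P` decimal digits of precision, and one fixed-width slot of `P + 1` digits per channel. The slot layout, the names `pack` and `extract`, and the sample values are all assumptions of this sketch, not part of the question:

```python
P = 1            # decimal digits kept per value (assumed)
W = P + 1        # digits per slot: one integer digit plus P decimals
SLOT = 10 ** W   # numeric size of one slot

def pack(values):
    """Pack non-negative values below 10 into one integer, channel 1 in the lowest slot."""
    packed = 0
    for i, value in enumerate(values):
        scaled = round(value * 10 ** P)
        if not 0 <= scaled < SLOT:
            raise ValueError(f"value {value} does not fit in a slot")
        packed += scaled * SLOT ** i
    return packed

def extract(packed, channel):
    """Recover the value of a 1-based channel number from a packed integer."""
    return ((packed // SLOT ** (channel - 1)) % SLOT) / 10 ** P

# Hypothetical samples for three channels at three points in time.
samples = [(1.5, 2.5, 0.3), (1.0, 4.2, 0.9), (3.3, 0.7, 2.0)]
packed_series = [pack(s) for s in samples]
print(packed_series[0])  # 32515

# min(C2 - C3) still has to visit every packed sample in the window.
min_diff = min(extract(c, 2) - extract(c, 3) for c in packed_series)
```

Python integers have arbitrary precision, so 50 channels fit in one value here; a fixed-width numeric column may not hold that many digits. Note that packing reduces the number of columns to read, not the number of samples: the minimum of the difference is still computed per sample.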

Abed Hawa answered Oct 05 '22 03:10