
Graphite, datapoints disappear if I choose a wider time range

Tags:

graphite

If I ask for this data:

https://graphite.it.daliaresearch.com/render?from=-2hours&until=now&target=my.key&format=json

I get, among other datapoints, this one:

[
  2867588,
  1398790800
]

If I ask for this data:

https://graphite.it.daliaresearch.com/render?from=-10hours&until=now&target=my.key&format=json

The datapoint looks like this:

[
  null,
  1398790800
]

Why is this datapoint being nullified when I choose a wider time range?

Update

I see that for a date range shorter than 7 hours the resolution is one datapoint every 10 seconds, and when the range is 7 hours or longer the resolution drops to one datapoint every minute. It keeps going in this direction as the range grows, to one datapoint every 10 minutes and so on.

So when the resolution is one datapoint every 10 seconds the data is there, but when the resolution is one datapoint per minute or coarser, the datapoint has no value :/

I'm sending one datapoint every hour; maybe there is a conflict between the resolution configuration and my sending only one datapoint per hour.

asked Apr 29 '14 18:04 by fguillen


2 Answers

There are several things happening here, but basically the problem is that you have misconfigured Graphite (or at least, configured it in a way that makes it do things you aren't expecting).

Specifically, you should set xFilesFactor = 0.0 in your storage-aggregation.conf file. Since you are new at this, you probably just want this (mine is in /opt/graphite/conf/storage-aggregation.conf):

[default]
pattern = .*
xFilesFactor = 0.0
aggregationMethod = average

The graphite docs describe xFilesFactor like this:

xFilesFactor should be a floating point number between 0 and 1, and specifies what fraction of the previous retention level’s slots must have non-null values in order to aggregate to a non-null value. The default is 0.5.

But wait! This won't change existing statistics! These aggregation settings are set once per metric, at the time the metric is created. Since you are new at this, the easy way out is to go to your whisper directory, delete the prior data, and start over:

cd /opt/graphite/storage/whisper/my/
rm key.wsp

Your root whisper directory may be different depending on platform, etc. After you remove the data files, Graphite should recreate them automatically on the next metric write, and they should get your updated settings (don't forget to restart carbon-cache after changing your storage-aggregation settings).

Alternatively, if you need to keep your old data, you will need to run whisper-resize.py against your whisper (.wsp) data files with --xFilesFactor=0.0, and likely also pass all of your retention settings from storage-schemas.conf (also viewable with whisper-info.py).

Finally, I should add that the reason you get non-null data in your first query but null data in your second is that Graphite tries to pick the best available retention period from which to serve your request, based on the time window you requested. For the smaller window, Graphite decides that it can serve your request from the highest-precision (i.e., non-aggregated) data, so you see your raw metrics.

For the longer window, Graphite finds that the high-precision, non-aggregated data is not available for the entire window (these periods are configured in storage-schemas.conf), so it skips to the next highest-precision data set available (i.e., the first aggregation tier) and returns only aggregated data. Because your aggregation config is writing null data, you see null metrics. So fix the aggregation, and you should fix the null data problem.

But remember that Graphite never combines aggregation tiers in a single request/response, so any time you see differences between results from the same query when all you are changing is the from / until params, the problem is pretty much always due to aggregation configs.
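The tier selection described above can be sketched in a few lines of Python. This is only an illustration, not Graphite's actual code: the function name `pick_archive` is made up, and the schema `10s:6h,1m:7d,10m:5y` is an assumption chosen to roughly match the resolutions reported in the question.

```python
def pick_archive(archives, window_seconds):
    """Return the first (finest) archive whose retention covers the window.

    `archives` is a list of (seconds_per_point, retention_seconds) tuples
    ordered from finest to coarsest resolution.
    """
    for step, retention in archives:
        if retention >= window_seconds:
            return step, retention
    # The window is longer than anything stored: fall back to the coarsest tier.
    return archives[-1]


# Hypothetical schema "10s:6h,1m:7d,10m:5y" (an assumption, not the asker's config).
ARCHIVES = [(10, 6 * 3600), (60, 7 * 86400), (600, 5 * 365 * 86400)]

print(pick_archive(ARCHIVES, 2 * 3600)[0])   # step used for from=-2hours
print(pick_archive(ARCHIVES, 10 * 3600)[0])  # step used for from=-10hours
```

Under these assumptions the 2-hour request is answered from the 10-second tier and the 10-hour request from the 1-minute tier, which is why only the second one depends on the aggregation settings.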

answered Nov 13 '22 23:11 by dpkp


I'm not quite sure about your specific situation, but I think I can give you some general pointers.

First off, you are right about the changing resolution depending on the time range. This is configured in storage-schemas.conf and is done to save space when storing data over large periods of time. An example could be: 15s:7d,1m:21d,15m:5y, meaning 15 seconds resolution for 7 days, then 1 minute resolution for 21 days, then 15min for 5 years.
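To make the retention string concrete, here is a small Python sketch that turns it into a step size and a number of stored points per tier. The helper names are hypothetical, and the parser is simplified: it only understands single-letter unit suffixes and treats a year as 365 days.

```python
# Seconds per unit suffix (simplified: single-letter suffixes only).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def to_seconds(spec):
    """Convert a string such as '15s' or '7d' to a number of seconds."""
    return int(spec[:-1]) * UNIT_SECONDS[spec[-1]]


def parse_retentions(retentions):
    """Return a list of (seconds_per_point, number_of_points) tuples."""
    archives = []
    for part in retentions.split(","):
        precision, duration = part.strip().split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


for step, points in parse_retentions("15s:7d,1m:21d,15m:5y"):
    print(f"one point every {step}s, {points} points kept")
```

Each tier keeps a fixed number of slots, so the coarser tiers cover far more time without using much more space.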

Then there is the way Graphite does the actual aggregation from one resolution to the next. This is configured in storage-aggregation.conf. The default settings are xFilesFactor=0.5 and aggregationMethod=average. The xFilesFactor setting says that at least 50% of the slots in the previous retention level must have values for the next retention level to contain an aggregate. The aggregationMethod says that the values of the slots in the previous retention level will be combined by averaging. My guess is that your stat doesn't have enough data points to fulfill the 50% requirement, resulting in a null value.
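A rough Python sketch of that rule, applied to the numbers from the question (10-second slots rolled up into 1-minute slots, with a single value present). The function `aggregate` is a simplified stand-in for the behaviour described in the docs, not whisper's actual implementation.

```python
def aggregate(slots, x_files_factor=0.5):
    """Average the known values in `slots`, or return None when too few are known."""
    known = [v for v in slots if v is not None]
    if not known:
        return None  # nothing to aggregate at all
    if len(known) / len(slots) < x_files_factor:
        return None  # not enough non-null slots to satisfy xFilesFactor
    return sum(known) / len(known)


# Six 10-second slots make up one 1-minute slot; only one of them holds a value.
minute = [2867588, None, None, None, None, None]

print(aggregate(minute, x_files_factor=0.5))  # 1 of 6 known is below 0.5
print(aggregate(minute, x_files_factor=0.0))  # any known value is enough
```

With one datapoint per hour, at most one of the six slots is ever filled, so the 0.5 threshold can never be met and the 1-minute tier stays null.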

For more information, check out the docs, they are pretty complete: http://graphite.readthedocs.org/en/latest/config-carbon.html

answered Nov 14 '22 00:11 by zeebonk