StatsD/Graphite Naming Conventions for Metrics

Tags: graphite, statsd

I'm beginning the process of instrumenting a web application, and using StatsD to gather as many relevant metrics as possible. For instance, here are a few examples of the high-level metric names I'm currently using:

http.responseTime
http.status.4xx
http.status.5xx
view.renderTime
oauth.begin.facebook
oauth.complete.facebook
oauth.time.facebook
users.active

...and there are many, many more. What I'm grappling with right now is establishing a consistent hierarchy and set of naming conventions for the various metrics, so that the current ones make sense and there are logical buckets in which to add future metrics.

My question is twofold:

What relevant metrics are you gathering that you have found indispensable?
What naming structure are you using to categorize metrics?


asked Aug 07 '13 15:08

Jared Hanson

1 Answer

This question has no definitive answer, but here's how we do it at Datadog (we are a hosted monitoring service, so we tend to obsess over these things).

1. Which metrics are indispensable? That is in the eye of the beholder. At a high level, though, for each team it is any metric that is as close as possible to their goals (which may not be the easiest to gather).

System metrics (e.g. system load, memory, etc.) are trivial to gather but seldom actionable, because it is too hard to reliably connect them to a probable cause.

On the other hand, the number of completed product tours matters to anyone tasked with making sure new users are happy from the first minute they use the product. StatsD makes this kind of thing trivially easy to collect.
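As a rough illustration of how little code such a business metric needs, here is a minimal sketch that formats a StatsD counter line and sends it over UDP. The metric name product.tours.completed and the host/port defaults (a local daemon on StatsD's usual port 8125) are assumptions for the example, not names from our setup.

```python
import socket


def format_counter(name: str, value: int = 1) -> str:
    """Build a StatsD counter line in the '<name>:<value>|c' wire format."""
    return f"{name}:{value}|c"


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one metric line to a StatsD daemon over UDP (fire-and-forget).

    The host and port are assumptions: a daemon listening locally on 8125.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))
```

Calling send_metric(format_counter("product.tours.completed")) each time a user finishes the tour would be enough to start counting; because the transport is UDP, the application does not block or fail if no daemon is listening.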

We have also found that the core set of key metrics for any team changes as the product evolves, so there is a continuous editorial process.

That in turn means anyone in the company needs to be able to pick and choose which metrics matter to them: no permissions asked, no friction to get to the data.

2. Naming structure. The highest level of the hierarchy is the product line or the process. Our web frontend is internally called dogweb, so all metrics from that component start with dogweb. as a prefix. The next level of the hierarchy is the sub-component, e.g. dogweb.db., dogweb.http., etc. The last level of the hierarchy is the thing being measured (e.g. renderTime or responseTime).
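The three-level scheme above can be sketched as a small helper that assembles names and rejects segments that would break the hierarchy. The function names metric_name and timer_line are hypothetical, made up for this illustration; only the dogweb, http, and responseTime segments come from the description above.

```python
def metric_name(product: str, component: str, measurement: str) -> str:
    """Join product, sub-component and measurement into a dotted name."""
    parts = (product, component, measurement)
    for part in parts:
        # A dot inside a segment would silently add a hierarchy level.
        if not part or "." in part:
            raise ValueError(f"invalid metric name segment: {part!r}")
    return ".".join(parts)


def timer_line(name: str, millis: int) -> str:
    """Format a StatsD timer line ('<name>:<value>|ms')."""
    return f"{name}:{millis}|ms"


name = metric_name("dogweb", "http", "responseTime")
line = timer_line(name, 320)
```

Funnelling every name through one helper like this is one way to keep the hierarchy consistent as new metrics are added.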

The unresolved issue in Graphite is that metric metadata is encoded in the metric name (and selected using *, e.g. dogweb.http.browser.*.renderTime). It's clever but can get in the way.
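To make the wildcard selection concrete, here is a simplified sketch of segment-wise matching, where * matches within one dot-separated segment and never across dots. It is an approximation built on Python's fnmatch, not Graphite's actual matcher (for instance, it does not handle {a,b} alternatives), and the browser names chrome and firefox are made up for the example.

```python
from fnmatch import fnmatchcase


def matches(pattern: str, metric: str) -> bool:
    """Match a dotted metric name against a pattern, one segment at a time."""
    pattern_parts = pattern.split(".")
    metric_parts = metric.split(".")
    # The wildcard never spans a dot, so the segment counts must agree.
    if len(pattern_parts) != len(metric_parts):
        return False
    return all(
        fnmatchcase(segment, glob)
        for glob, segment in zip(pattern_parts, metric_parts)
    )


def select(pattern: str, metrics: list[str]) -> list[str]:
    """Return the metric names that match the pattern, in input order."""
    return [m for m in metrics if matches(pattern, m)]


metrics = [
    "dogweb.http.browser.chrome.renderTime",
    "dogweb.http.browser.firefox.renderTime",
    "dogweb.http.browser.chrome.responseTime",
    "dogweb.db.queryTime",
]
selected = select("dogweb.http.browser.*.renderTime", metrics)
```

Here the browser is metadata that lives at a fixed position in the name. The selection only works if every producer puts it in the same position, which is part of why this encoding can get in the way.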

We ended up implementing explicit metadata in our data model, but this is not in StatsD/Graphite, so I will leave the details out. If you want to know more, contact me directly.

answered Oct 07 '22 15:10

Alexis Lê-Quôc

