I'm looking at adding StatsD data collection to my Grails application, and looking around at existing libraries and code has left me a little confused as to what would be a good, scalable solution. To put the question into context: I'm working on an online-gaming project where I will be monitoring user interactions with the game engine. These interactions cluster around particular moments in time: X users perform interactions within a window of a second or two, then repeat after a 10-20 second pause.
Here is my analysis of the options that are available today.
https://github.com/etsy/statsd/blob/master/examples/StatsdClient.java
The "simplest thing that could possibly work" solution: I could pull this class into my project, instantiate a singleton instance as a Spring bean, and use it directly. However, after noticing that the grails-statsd plugin creates a pool of client instances, I started wondering about the scalability of this approach.
It seems that the doSend method could become a bottleneck if many threads are trying to send events at the same time. However, as I understand it, due to the fire and forget nature of sending UDP packets, this should happen quickly, avoiding the huge overhead that we usually associate with network connections.
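A minimal sketch of that single-shared-client idea, for illustration only: UdpMetricSender is a hypothetical class name (not the Etsy StatsdClient), the address is resolved once in the constructor, and one DatagramChannel is shared by all threads. DatagramChannel is documented as safe for use by multiple threads, with at most one thread writing at a time, so any contention lasts only as long as each individual send call.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;

/** Hypothetical minimal sender; a sketch, not the Etsy example class. */
final class UdpMetricSender implements AutoCloseable {
    private final DatagramChannel channel;
    private final InetSocketAddress target;

    UdpMetricSender(String host, int port) throws IOException {
        // The host name is resolved here, once, rather than on every send.
        this.target = new InetSocketAddress(host, port);
        if (target.isUnresolved()) {
            throw new IOException("Cannot resolve " + host);
        }
        this.channel = DatagramChannel.open();
    }

    /**
     * Sends one datagram containing the given StatsD line.
     * Returns false on an I/O error instead of propagating it,
     * since losing a metric should not break the caller.
     */
    boolean send(String stat) {
        try {
            ByteBuffer payload = ByteBuffer.wrap(stat.getBytes(StandardCharsets.UTF_8));
            return channel.send(payload, target) > 0;
        } catch (IOException e) {
            return false;
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Usage would be something like new UdpMetricSender("localhost", 8125).send("game.interactions:1|c"), where 8125 is the conventional StatsD port and the metric name is made up.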
https://github.com/charliek/grails-statsd/
Someone has already created a StatsD plugin for Grails that includes some nice features, such as the annotations and the withTimer method. However, I see that the implementation there is missing some bug fixes from the example implementation, such as specifying the locale on calls to String.format. I'm also not a huge fan of pulling in Apache commons-pool just for this, when a standard Executor could achieve a similar effect.
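To show why the locale matters: the StatsD line format expects a dot as the decimal separator, but String.format uses the JVM's default locale unless one is passed explicitly. A small sketch of the effect follows; StatFormat, the format string, and the metric name are made up for illustration and are not taken from any of the libraries above.

```java
import java.util.Locale;

/** Hypothetical helper demonstrating locale-sensitive number formatting. */
final class StatFormat {
    /** Formats a sampled counter line using the given locale. */
    static String counter(Locale locale, String key, int delta, double sampleRate) {
        return String.format(locale, "%s:%d|c|@%f", key, delta, sampleRate);
    }

    public static void main(String[] args) {
        // A German locale uses a comma as the decimal separator.
        System.out.println(counter(Locale.GERMANY, "game.moves", 1, 0.5));
        // prints game.moves:1|c|@0,500000

        // Pinning the locale keeps the dot regardless of the server's settings.
        System.out.println(counter(Locale.ENGLISH, "game.moves", 1, 0.5));
        // prints game.moves:1|c|@0.500000
    }
}
```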
https://github.com/tim-group/java-statsd-client/
This is an alternative pure Java library that operates asynchronously by maintaining its own ExecutorService. It supports the entire StatsD API, including sets and sampling, but doesn't provide any hooks for configuring the thread pool and queue size. If problems arise, for non-critical things such as monitoring, I think I would prefer a finite queue that loses events over having an infinite queue that fills up my heap.
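For what it's worth, the "finite queue that loses events" behaviour can be sketched with the standard library alone. This is not how java-statsd-client works; BoundedMetricExecutor, the thread name, and the capacity are all assumptions for illustration. A single worker thread drains a fixed-size queue, and anything submitted while the queue is full is counted and discarded rather than blocking the caller.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical wrapper: one sender thread, bounded queue, drop on overflow. */
final class BoundedMetricExecutor {
    private final AtomicLong dropped = new AtomicLong();
    private final ThreadPoolExecutor executor;

    BoundedMetricExecutor(int queueCapacity) {
        this.executor = new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(queueCapacity),
            r -> {
                // Daemon thread so pending metrics never keep the JVM alive.
                Thread t = new Thread(r, "statsd-sender");
                t.setDaemon(true);
                return t;
            },
            // Rejection handler: count the lost event instead of throwing.
            (task, pool) -> dropped.incrementAndGet());
    }

    /** Queues a send task, or silently drops it if the queue is full. */
    void submit(Runnable sendTask) {
        executor.execute(sendTask);
    }

    /** Number of tasks discarded so far. */
    long droppedCount() {
        return dropped.get();
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

The dropped counter is there so that lost events are at least visible; whether dropping the newest or the oldest event is preferable is a design choice this sketch does not settle.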
https://github.com/vznet/play-statsd/
Now, I can't use this code directly in my Grails project, but I thought it was worth a look to see how things were implemented. Generally, I love the way the code in StatsdClient.scala is built up: very clean and readable. It also appears to have the locale bug, but is otherwise feature-complete with respect to the Etsy sample. Interestingly, unless there is some Scala magic that I've not understood, this appears to create a new socket for each data point that is sent to StatsD. While this approach nicely avoids the need for an object pool or executor thread, I can't imagine it's terribly efficient, as it potentially performs DNS lookups within the request thread, which should be returning to the user as soon as possible.
So far, it looks like the best existing solution is the Grails plugin, as long as I can accept the commons-pool dependency. Right now, though, I'm seriously considering spending Sunday writing my own version that combines the best parts of each implementation.
Speaking as the primary committer of the java-statsd-client, as well as someone who uses this library in production, I'd like to attempt to allay your fears regarding "having an infinite queue that fills up my heap."
I think you pretty much nailed it with your analysis of the Etsy StatsD client example when you said "due to the fire and forget nature of sending UDP packets, this should happen quickly, avoiding the huge overhead that we usually associate with network connections."
As I understand it, given the way java-statsd-client is currently implemented, the only thing that limits how quickly the queue of outbound messages drains is the speed of fire-and-forget UDP packet sending. I'm not an expert in this area, but I'm unaware of any way in which this could block such that an infinite queue might build up.
When you originally did your evaluation, there were a number of outstanding issues with the java-statsd-client (e.g. Locale/character encoding ambiguities, and a lack of sampling support), but these have recently been addressed. What remains is the question of whether there is a genuine risk of filling up the heap. I'd be keen to hear thoughts from the community on this matter, and, if the consensus is that there is an issue, I would be delighted to explore the introduction of a limiting queue into the library.
After sleeping on this for a week, I think I'm going to go ahead and use the existing Grails StatsD plugin. The rationale is that, although I could achieve a similar effect using an Executor to handle concurrency, without an object pool this would still be bound to a single client/socket instance, which is in theory a rather obvious bottleneck in the application. Therefore, if I need a pool anyway, I may as well use one where someone else has done all the hard work :)