Suppose I've got a ton (a continuous stream) of requests to process, and each request has several stages. For example: "connecting to data source", "reading data from data source", "validating data", "processing data", "connecting to data sink", "writing result to data sink".
Which visualization methods, or even tools, are well suited to visualizing the behavior of such a system?
I'd like to be able to see which stages are taking a long time, and how the stages of different requests line up against each other in time (for example, to see that the data source responds more slowly when too many requests access it at once).
If there were just a few dozen requests, I'd be OK with a few dozen individual colored timelines, but for a few thousand that doesn't fit well. I think I can get away with N colored timelines, where N is the "concurrency factor", but 1) perhaps there's something better, 2) perhaps there exist tools for this?
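To make the "N colored timelines" idea concrete, here is a minimal sketch of how requests could be packed into lanes so that the number of lanes equals the peak concurrency. It assumes each request is reduced to a hypothetical (start, end) pair already parsed from the log; the function name `assign_lanes` and the input shape are illustrative only and not part of timeplot or any other tool.

```python
import heapq
from typing import List, Tuple


def assign_lanes(requests: List[Tuple[float, float]]) -> List[int]:
    """Greedy lane packing (assumed input: one (start, end) pair per request).

    Returns the lane index for each request, in input order. A lane is
    reused as soon as the request occupying it has ended, so the number
    of lanes used equals the maximum number of overlapping requests.
    """
    # Process requests in order of start time, remembering original positions.
    order = sorted(range(len(requests)), key=lambda i: requests[i][0])
    busy: List[Tuple[float, int]] = []  # min-heap of (end_time, lane)
    free: List[int] = []                # min-heap of released lane indices
    lanes = [0] * len(requests)
    next_lane = 0

    for i in order:
        start, end = requests[i]
        # Release every lane whose request finished before this one starts.
        while busy and busy[0][0] <= start:
            _, lane = heapq.heappop(busy)
            heapq.heappush(free, lane)
        if free:
            lane = heapq.heappop(free)  # reuse the lowest-numbered free lane
        else:
            lane = next_lane            # all lanes busy: open a new one
            next_lane += 1
        lanes[i] = lane
        heapq.heappush(busy, (end, lane))
    return lanes
```

Each lane can then be drawn as one horizontal timeline, with the stages of the requests assigned to it colored along its length. A few thousand requests collapse into as many rows as there were simultaneous requests at the busiest moment.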
P.S. Shameless plug: Once I figure out the best way to visualize this, I'll add it to my nifty tool called timeplot ;)
P.P.S. Another shameless plug: I decided to write a separate tool: splot. Here's what it can do, based on a trivially simple log and an awk one-liner:
It's showing 160 cores of a cluster performing tasks fed to them by RabbitMQ. Blue is "fetching data", orange is "computing", white is "doing nothing". Several problems are immediately obvious from this diagram, which would be very hard to find by just looking at the logs.
I have multi-process software that runs on a machine with 15 cores. Here is what I do.
I log all messages to syslog, then plot (selected) log data from the last 20 minutes on a SIMILE Timeline: http://www.simile-widgets.org/timeline. To keep an eye on what gets logged when, and on the patterns, I use a syslog viewer. There are plenty of them, so you can find one that suits you: http://www.google.com/search?aq=0&oq=syslog+vi&sourceid=chrome&ie=UTF-8&q=syslog+viewer
Hope this helps.