
Centralized network logging - syslog and alternatives? [closed]

At work, we're building a distributed application (possibly across several machines on a LAN, possibly later across several continents on a WAN+VPN). We don't want log files local to each machine (filling up its disk and impossible to view in aggregate), so we need to centralize logging over the network. Most logs won't be important, so UDP is fine for them, but some are alerts whose loss costs us money; those must be delivered reliably, which implies TCP. We're worried about congesting the network if the logging protocol is too chatty, or dragging the apps to a crawl if it isn't responsive.
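To make that split concrete, here is a minimal, purely illustrative Python sketch of routing routine records over UDP and critical alerts over TCP using the standard library's SysLogHandler; the host loghost.example.com and port 514 are placeholders, not part of our actual setup.

    # Illustrative only: route routine records over UDP and critical alerts
    # over TCP using two SysLogHandlers pointed at a central syslog server.
    # "loghost.example.com" and port 514 are placeholders.
    import logging
    import logging.handlers
    import socket

    logger = logging.getLogger("app")
    logger.setLevel(logging.DEBUG)

    # Routine records: fire-and-forget over UDP.
    udp = logging.handlers.SysLogHandler(
        address=("loghost.example.com", 514), socktype=socket.SOCK_DGRAM)
    udp.setLevel(logging.DEBUG)

    # Must-not-lose alerts: delivered over a TCP connection.
    tcp = logging.handlers.SysLogHandler(
        address=("loghost.example.com", 514), socktype=socket.SOCK_STREAM)
    tcp.setLevel(logging.CRITICAL)

    logger.addHandler(udp)
    logger.addHandler(tcp)

    logger.info("routine event, occasional loss is acceptable")   # UDP only
    logger.critical("money-losing alert, must not be dropped")    # UDP and TCP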

Some possibilities I've considered are:

  • syslog (it seems perfect, but my boss has an animus against this so I may not be able to choose it).
  • scribe from facebook (but it seems a bit heavyweight with a server on every machine - not every log message needs ultra-reliability).
  • using a message queue like RabbitMQ, which can have multiple queues tuned to different levels of transaction safety (a rough sketch of this idea follows the list).
  • worst case, I can write my own from scratch.
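
For the message-queue option, here is a rough, purely illustrative sketch of what "queues tuned to different levels of transaction safety" could look like with RabbitMQ and the pika Python client; the queue names and broker host are made up.

    # Illustrative only: two RabbitMQ queues with different safety levels,
    # published to with the pika client. Names and host are invented.
    import pika

    conn = pika.BlockingConnection(
        pika.ConnectionParameters(host="rabbit.example.com"))
    channel = conn.channel()

    # Best-effort queue for routine logs: not durable, messages transient.
    channel.queue_declare(queue="logs.routine", durable=False)

    # Durable queue for critical alerts: survives a broker restart.
    channel.queue_declare(queue="logs.critical", durable=True)

    # Routine message: no persistence requested.
    channel.basic_publish(exchange="", routing_key="logs.routine",
                          body="cache miss on item 42")

    # Critical message: delivery_mode=2 marks it persistent (written to disk).
    channel.basic_publish(exchange="", routing_key="logs.critical",
                          body="payment gateway unreachable",
                          properties=pika.BasicProperties(delivery_mode=2))

    conn.close()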

Do you have other suggestions? What centralized logging solutions have you used, and how well did they work out?

Edit: I was leaning towards Scribe, because its store-and-forward design decouples the running app from network latency. But after struggling to install it, I found that (1) it's not available as a binary package - nowadays that's unforgivable - and (2) it depends intimately on a library (Thrift) that isn't available as a binary package either! And worst of all, it wouldn't even compile properly. That is not release-quality code, even in open source.

asked Nov 27 '09 by Julian Morrison


4 Answers

We have successfully used ZeroMQ for the logs of a distributed application in a scenario like yours. It's very reliable and incredibly fast. We moved to ZeroMQ after a not-so-successful implementation with Spread. In our setup, a single ZeroMQ server is able to handle more than 70 different logs from a moderately to highly busy distributed application. It receives data both over the LAN and over the Internet.
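For anyone curious what that pattern looks like in code, here is a minimal, illustrative pyzmq sketch (not the poster's actual setup): application nodes PUSH log lines to a single collector that PULLs them. The endpoint tcp://logserver:5555 is a placeholder.

    # Illustrative pyzmq sketch: application nodes PUSH log lines to a
    # single collector that PULLs and stores them. Endpoint is a placeholder.
    import zmq

    # --- on each application node ---
    ctx = zmq.Context.instance()
    push = ctx.socket(zmq.PUSH)
    push.connect("tcp://logserver:5555")
    push.send_string("app-node-1 | INFO | order 1234 accepted")

    # --- on the central log server ---
    # pull = ctx.socket(zmq.PULL)
    # pull.bind("tcp://*:5555")
    # while True:
    #     line = pull.recv_string()
    #     ...  # append to a file, index it, forward it, etc.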

If you need a detailed queue server comparison, look at this page from the Second Life wiki.

Hope it helps!

Ass3mbler


Several alternatives have appeared recently. Notably, Scribe is not maintained any more; Facebook developed a successor called Calligraphus, which is not open-sourced. Here's a list of alternatives:

  • syslog: installed on virtually all Linux distributions
  • Fluentd: a lightweight log collector written in C and Ruby that handles logs as a JSON stream (see the sketch at the end of this answer)
  • Flume: developed at Cloudera, written in Java, and works well with the Hadoop ecosystem
  • Apache Kafka: developed at LinkedIn, pull-based architecture
  • Scribe: open-sourced by Facebook, but not maintained anymore

Disclaimer: I'm a committer on the Fluentd project.
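To illustrate the JSON-stream style, here is a small sketch using the fluent-logger Python package, assuming a local Fluentd (td-agent) instance listening on the default forward port 24224; the tag and record fields are made up.

    # Illustrative only: send structured events to a local Fluentd instance
    # with the fluent-logger package (default forward port 24224 assumed).
    from fluent import sender

    log = sender.FluentSender("myapp", host="localhost", port=24224)

    # Each record is a tagged JSON object; Fluentd routes it by tag
    # (here "myapp.access" and "myapp.alert").
    log.emit("access", {"user": "julian", "path": "/checkout", "status": 200})
    log.emit("alert", {"severity": "critical", "message": "payment failed"})

    log.close()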

Kazuki Ohta


The other suggestions might be great, but I've had good luck with Syslog-NG. It is extremely flexible and configurable, yet it's easy to pick up and do something useful with quickly.

JasonSmith


Syslog is good if you intend to focus only on infrastructure logs (i.e. at the system level). I've heard that KIWI Syslog Server is a good one, though I haven't tried it myself. On the other hand, if you want to log application-related things, syslog is perhaps not the best option. If you use the Apache logging services (log4j, log4xxx and the rest), then logFaces would be a good solution, as it's built specifically for aggregating multiple applications in one place. It works over both TCP and UDP and has a decent log viewer and database integration.

Disclosure: I am the author of this product.

Dima