I am working on a solution for centralized log file aggregation from our CentOS 6.x servers. After installing the Elasticsearch/Logstash/Kibana (ELK) stack, I came across the rsyslog omelasticsearch plugin, which can send messages from rsyslog to Elasticsearch in Logstash format, and started asking myself why I need Logstash at all.
Logstash has a lot of different input plugins, including one that accepts rsyslog messages. Is there a reason to use Logstash for my use case, where I need to gather the contents of log files from multiple servers? And is there a benefit to sending messages from rsyslog to Logstash instead of sending them directly to Elasticsearch?
I would use Logstash in the middle only if there's something I need from it that rsyslog doesn't have. For example, resolving GeoIP information from an IP address, as sketched below.
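As a rough illustration (not a drop-in config), a minimal Logstash pipeline doing a GeoIP lookup might look like this. The field name clientip is an assumption and depends on how your events are parsed, and older Logstash releases use host instead of hosts in the elasticsearch output:

```
input {
  syslog {
    port => 5514                 # listen for syslog messages forwarded by rsyslog
  }
}

filter {
  geoip {
    source => "clientip"         # hypothetical field containing the IP to look up
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]  # placeholder address for the Elasticsearch node
  }
}
```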
If, on the other hand, I just needed to get syslog or file contents indexed in Elasticsearch, I'd use rsyslog directly. It can do buffering (disk + memory) and filtering, you can choose how the resulting document will look (for example, using the textual severity instead of the numeric code), and it can parse unstructured data. But the main advantage is performance, which is rsyslog's focus. Here's a presentation with some numbers (and tips and tricks) on Logstash, rsyslog and Elasticsearch: http://blog.sematext.com/2015/05/18/tuning-elasticsearch-indexing-pipeline-for-logs/
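For reference, a sketch of the rsyslog-to-Elasticsearch setup described above, modelled on the common omelasticsearch examples; the server address is a placeholder and the queue sizes are left at defaults:

```
module(load="omelasticsearch")

# Build a JSON document with textual severity/facility instead of the numeric codes
template(name="es-json" type="list") {
  constant(value="{")
  constant(value="\"@timestamp\":\"")  property(name="timereported" dateFormat="rfc3339")
  constant(value="\",\"host\":\"")     property(name="hostname")
  constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
  constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
  constant(value="\",\"tag\":\"")      property(name="syslogtag" format="json")
  constant(value="\",\"message\":\"")  property(name="msg" format="json")
  constant(value="\"}")
}

action(type="omelasticsearch"
       server="localhost"
       serverport="9200"
       template="es-json"
       bulkmode="on"                    # send documents in bulk requests
       queue.type="linkedlist"          # in-memory action queue...
       queue.filename="es_queue"        # ...that spills to disk when needed
       queue.saveonshutdown="on"
       action.resumeretrycount="-1")    # keep retrying if Elasticsearch is down
```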
I would recommend Logstash. It is easier to set up, there are more examples, and the components are tested to fit together.
There are other benefits as well: in Logstash you can filter and modify your logs.
Moreover, you can tune the batch size to optimize indexing into Elasticsearch. And if something goes wrong and Elasticsearch cannot keep up with the volume of events per second, you can configure Logstash to queue events or drop the ones that cannot be saved, as in the sketch below.
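As an illustration only (option names vary by Logstash version, so verify against your release): older elasticsearch output versions exposed a flush_size for the bulk batch size, and newer releases can buffer events on disk via a persistent queue in logstash.yml:

```
output {
  elasticsearch {
    hosts      => ["localhost:9200"]
    flush_size => 500            # events per bulk request (name/availability depends on version)
  }
}
```

```
# logstash.yml (newer Logstash releases)
queue.type: persisted            # buffer events on disk instead of losing them
queue.max_bytes: 1gb             # cap the on-disk queue size
```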