I currently have an architecture with Filebeat as the log shipper, which sends logs to a Logstash indexer instance and then on to managed Elasticsearch in AWS. Because of persistent TCP connections, I cannot load balance multiple Logstash indexer instances behind an AWS ELB, since Filebeat always picks one of the instances and sends everything there. So I decided to use Redis. Now, seeing how difficult it is to scale Redis and make it a highly available component in the ELK stack, I want to ask what the point of Redis even is. I have read a million times that it acts as a buffer, but if Filebeat stops sending logs to Logstash when Logstash can't handle the load, why do we even need a buffer? Filebeat is smart enough to know to stop sending logs. Logstash is smart enough to stop sending logs to Elasticsearch if Elasticsearch goes down. So the pipeline stops. I really don't understand the point of Redis acting as a buffer in every standard ELK architecture.
Unfortunately, the moment issues occur is precisely when all the components in the ELK Stack come under pressure. Message brokers like Redis and Kafka help absorb sudden data bursts and relieve the pressure on downstream components.
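To make the "buffer versus backpressure" trade-off concrete, here is a minimal sketch in Python. It is a toy model, not a description of how Filebeat, Logstash or Redis are implemented: the function name `simulate`, the tick-based loop and all the numbers are assumptions chosen for illustration. It only shows where the backlog sits during a burst when the intermediate queue is small versus large.

```python
def simulate(arrivals, drain_rate, buffer_capacity):
    """Toy model of shipper -> buffer -> indexer (illustrative only).

    arrivals        -- events produced per tick (hypothetical numbers)
    drain_rate      -- events the indexer can consume per tick
    buffer_capacity -- how many events the intermediate queue can hold

    Returns (peak_held_at_shipper, total_backlog_at_end).
    """
    held_at_shipper = 0  # events the shipper could not hand off yet
    buffered = 0         # events sitting in the intermediate queue
    peak_held = 0

    for produced in arrivals:
        held_at_shipper += produced
        # The buffer accepts only as much as it has free room for;
        # anything beyond that stays at the shipper (backpressure).
        accepted = min(held_at_shipper, buffer_capacity - buffered)
        held_at_shipper -= accepted
        buffered += accepted
        # The indexer drains at a fixed rate.
        buffered -= min(drain_rate, buffered)
        peak_held = max(peak_held, held_at_shipper)

    return peak_held, held_at_shipper + buffered


burst = [10, 10, 100, 10, 10]  # a sudden spike in the third tick

print(simulate(burst, drain_rate=20, buffer_capacity=20))
print(simulate(burst, drain_rate=20, buffer_capacity=1000))
```

In this model the total backlog at the end is the same in both runs, because the indexer is no faster. What changes is where the backlog waits: with the small queue it piles up at the shipper, with the large queue the shipper never has to hold anything back. That is the sense in which a broker "buffers", and also why it adds nothing if the sources can safely wait.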
Redis is an in-memory data structure store, used as a database, cache and message broker; its primary database model is a key-value store. Elasticsearch is a modern search and analytics engine based on Apache Lucene.
The ELK Stack helps by providing users with a powerful platform that collects and processes data from multiple data sources, stores that data in one centralized data store that can scale as data grows, and that provides a set of tools to analyze the data.
What is the EFK Stack? You might have heard of the ELK or EFK stack, both of which are very popular. It is a set of monitoring tools: Elasticsearch (object store), Logstash or Fluentd (log routing and aggregation), and Kibana for visualization.
Redis or Kafka or XYZ can be used as a buffer in the ELK stack, as you've rightly noticed.
The ES folks published a blog post yesterday about using Kafka in the pipeline, but it could as well have been Redis or XYZ. They make a good point about WHEN such a buffer could be needed and when it is not.
It is a good idea to have such a buffer in order to
If you don't anticipate such behaviors, i.e. you know
...then you don't need such a buffer. What's more, that will be one less piece of software you need to manage, monitor and maintain.
When it comes to the Elastic Stack ecosystem, there's no one-size-fits-all approach; it always depends on your precise use case and requirements. You need to ask yourself what is important to you, your system(s) and your users and then design your solution accordingly.