Why do I need a broker for my production ELK stack + machine specs?

Tags: redis, elasticsearch, logstash, kibana

I've recently stood up a test ELK stack on an Ubuntu box to try out the functionality and have been very happy with it. My use case for production would involve ingesting at least 100GB of logs per day. I want to be as scalable as possible, as this 100GB/day can quickly rise as we add more log sources.

I read some articles on ELK in production, including the fantastic Logz.io ELK Deployment. While I have a general idea of what I need to do, I am unsure about some core concepts: how many machines I need for such a large amount of data, and whether I need a broker like Redis in my architecture.

What is the point of a broker like Redis? In my test instance, I have multiple log sources sending logs over TCP, syslog, and logstash-forwarder directly to Logstash on my ELK server (which also has Elasticsearch, Nginx, and Kibana installed and configured with SSL).

To run a highly available, state-of-the-art production cluster, what machines and specs do I need for at least 100GB of data per day, likely scaling toward 150GB or more in the future? I am planning on using my own servers. From what I've researched, the starting point should look something like this (assuming I include Redis):

2-3 servers, each with a Redis+Logstash (indexer) instance. For specs, I am thinking 32GB RAM, a fast 500GB disk (maybe SSD), 8 cores (i7)
3 servers for Elasticsearch (this is the one I am most unsure about) -- I know I need at least 3 master nodes and 2 data nodes, so 2 servers will have 1 master/1 data each -- these will be beefy 64GB RAM, 20TB, 8 cores. The other remaining master node can be on a low spec machine, as it is not handling data.
2 servers for Nginx/Kibana -- these should be low spec machines, as they are just the web server and UI. Is a load balancer necessary here?

EDIT: Planning on keeping the logs for 60 days.

asked May 20 '15 22:05

jeffrey

1 Answer

As for Redis, it acts as a buffer in case logstash and/or elasticsearch are down or slow. If you're using the full logstash or logstash-forwarder as a shipper, it will detect when logstash is unavailable and stop sending logs (remembering where it left off, at least for a while).

So, in a pure logstash/logstash-forwarder environment, I see little reason to use a broker like redis.

Where it becomes important is for sources that don't care about logstash's status and don't buffer on their side. syslog, snmptrap, and others fall into this category. Since your sources include syslog, I would bring up brokers in your setup.
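
To make the buffering argument concrete, here is a toy simulation in Python. It is not Redis and not logstash, just an in-memory queue standing in for the broker. The function name `simulate`, the tick-based timing, and the `capacity` parameter are all made up for illustration; `capacity=0` models a fire-and-forget source such as syslog with no broker in front of the indexer.

```python
from collections import deque


def simulate(events, indexer_up, capacity):
    """Toy model: one event arrives per tick; indexer_up[i] says whether
    the indexer is reachable on that tick.

    capacity is the number of events the (hypothetical) broker can hold.
    Returns (indexed, dropped, pending).
    """
    buffer = deque()
    indexed, dropped = [], []
    for event, up in zip(events, indexer_up):
        if up:
            # Indexer is back: drain the backlog first, then take the new event.
            indexed.extend(buffer)
            buffer.clear()
            indexed.append(event)
        elif len(buffer) < capacity:
            # Indexer is down but the broker still has room.
            buffer.append(event)
        else:
            # Indexer is down and there is no room (or no broker): event is lost.
            dropped.append(event)
    return indexed, dropped, list(buffer)


events = list(range(10))
# Indexer is down for ticks 3 through 6.
outage = [True] * 3 + [False] * 4 + [True] * 3

no_broker = simulate(events, outage, capacity=0)
small_broker = simulate(events, outage, capacity=2)
big_broker = simulate(events, outage, capacity=10)

print("no broker, dropped:", no_broker[1])
print("small broker, dropped:", small_broker[1])
print("big broker, dropped:", big_broker[1])
```

With no broker every event sent during the outage is lost; a broker that is too small only delays the loss; a broker sized for the outage loses nothing. That is why the amount of memory you give the broker matters.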

Redis is a RAM-intensive app, and the amount of memory that you have will dictate how long of a logstash outage you can withstand. On a 32GB server (shared with logstash), how much of the memory would you give to redis? How large is your average document size? How many documents would it take to fill the memory? How long does it take to generate that many documents? In my experience, redis fails horribly when the memory fills, but that could just have been me.
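
Those questions boil down to simple arithmetic. A rough sketch in Python follows; only the 100GB/day figure comes from the question. The 16GB share for redis, the 500-byte average event, and the 1.5x per-entry overhead factor are placeholder assumptions you should replace with your own measurements, and `buffer_capacity` is just an illustrative helper name.

```python
GB = 1024 ** 3
SECONDS_PER_DAY = 86_400


def buffer_capacity(redis_bytes, avg_event_bytes, daily_ingest_bytes, overhead=1.0):
    """Estimate how many events fit in the buffer and how long it takes
    to fill at the average ingest rate.

    overhead is an assumed multiplier for per-entry memory cost in the broker.
    Returns (events_that_fit, seconds_until_full).
    """
    bytes_per_event = avg_event_bytes * overhead
    events_that_fit = int(redis_bytes // bytes_per_event)
    events_per_second = daily_ingest_bytes / avg_event_bytes / SECONDS_PER_DAY
    seconds_until_full = events_that_fit / events_per_second
    return events_that_fit, seconds_until_full


events, seconds = buffer_capacity(
    redis_bytes=16 * GB,           # assumption: half of the 32GB box goes to redis
    avg_event_bytes=500,           # assumption: measure your real average event size
    daily_ingest_bytes=100 * GB,   # from the question
    overhead=1.5,                  # assumption: memory overhead per queued entry
)
print(f"{events:,} events fit, roughly {seconds / 3600:.1f} hours of outage")
```

Under these particular assumptions the buffer fills after about two and a half hours. This uses the average rate over a day; if your traffic is bursty, the peak rate is what decides how fast the memory fills.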

Logstash is a CPU-intensive process as all the filters get executed.

As for the size of the elasticsearch cluster, @magnus already pointed you to some information that might help. Starting with 64GB machines is great, and then scale horizontally as needed.
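
For a first guess at disk, you can work backwards from the numbers in the question (100GB/day, 60 days of retention). The sketch below is back-of-the-envelope only: one replica per shard, an indexed-to-raw size ratio of 1.0, and 15% free-disk headroom are assumptions, not measurements, and `required_storage_tb` is a made-up helper name. The real ratio depends heavily on your mappings, so index a day of real logs and measure it.

```python
def required_storage_tb(daily_gb, retention_days, replicas=1,
                        index_ratio=1.0, headroom=0.15):
    """Estimate total cluster disk in TB (1 TB = 1000 GB here).

    index_ratio: assumed size on disk relative to raw log size.
    headroom:    assumed fraction of each disk to keep free.
    """
    primary_gb = daily_gb * retention_days * index_ratio
    total_gb = primary_gb * (1 + replicas)   # primaries plus replica copies
    return total_gb / (1 - headroom) / 1000


today = required_storage_tb(100, 60)    # figures from the question
future = required_storage_tb(150, 60)   # the growth case from the question

print(f"100GB/day: about {today:.1f} TB across all data nodes")
print(f"150GB/day: about {future:.1f} TB across all data nodes")
```

With these assumptions you land at roughly 14 TB today and 21 TB at 150GB/day, spread over all data nodes.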

You should have two client (non-data) nodes that are used as the access point for inserts (efficiently dispatching the requests to the correct data node) and searches (handling the 'reduce' phase with data returned from the data nodes). Two of these in a failover config would be a good start.

Two kibana machines will give you redundancy. Putting them in a failover config is also good. nginx was used more with kibana3, I believe. I don't know if people are using it with kibana4 or have moved to 'shield'.

Hope that helps.

answered Sep 24 '22 19:09

Alain Collins
