Why do I need a broker for my production ELK stack + machine specs?

I've recently stood up a test ELK stack on an Ubuntu box to try out the functionality and have been very happy with it. My production use case would involve ingesting at least 100GB of logs per day. I want to be as scalable as possible, as this 100GB/day can quickly rise as we add more log sources.

I read some articles on running ELK in production, including the fantastic Logz.io ELK Deployment. While I have a general idea of what I need to do, I am unsure about some core concepts: how many machines I need for such a large amount of data, and whether I need a broker like Redis in my architecture.

What is the point of a broker like Redis? In my test instance, I have multiple log sources sending logs over TCP, syslog, and logstash-forwarder directly to Logstash on my ELK server (which also has Elasticsearch, Nginx, and Kibana installed and configured with SSL).

To run a highly available, state-of-the-art production cluster, what machines and specs do I need for at least 100GB of data per day, likely scaling toward 150GB or more in the future? I am planning on using my own servers. From what I've researched, the starting point should look something like this (assuming I include Redis):

  • 2 or 3 servers, each with a Redis+Logstash (indexer) instance. For specs, I am thinking 32GB RAM, a fast 500GB disk (maybe SSD), and 8 cores (i7)
  • 3 servers for Elasticsearch (this is the one I am most unsure about) -- I know I need at least 3 master nodes and 2 data nodes, so 2 servers will have 1 master/1 data each -- these will be beefy 64GB RAM, 20TB, 8 cores. The other remaining master node can be on a low spec machine, as it is not handling data.
  • 2 servers for Nginx/Kibana -- these should be low spec machines, as they are just the web server and UI. Is a load balancer necessary here?

EDIT: Planning on keeping the logs for 60 days.

jeffrey asked May 20 '15 22:05



1 Answer

As for Redis, it acts as a buffer in case logstash and/or elasticsearch are down or slow. If you're using the full logstash or logstash-forwarder as a shipper, the shipper will detect when logstash is unavailable and stop sending logs (remembering where it left off, at least for a while).

So, in a pure logstash/logstash-forwarder environment, I see little reason to use a broker like redis.

A broker becomes important for sources that don't care about logstash's status and don't buffer on their side. syslog, snmptrap, and others fall into this category. Since your sources include syslog, I would bring up brokers in your setup.
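To make the difference concrete, here is a toy simulation in Python. It is only a sketch: the `deliver` function, the tick-based model, and the in-memory `deque` standing in for Redis are all made up for illustration, and the queue is assumed to have unlimited memory.

```python
from collections import deque


def deliver(events, indexer_up, broker=None):
    """Return the events that eventually reach the indexer.

    events:     one event per tick (any values)
    indexer_up: one bool per tick, True when the indexer accepts data
    broker:     optional deque acting as the queue; None means the
                source sends straight to the indexer and never retries
    """
    received = []
    for event, up in zip(events, indexer_up):
        if broker is None:
            # Fire-and-forget source: the event is lost if the indexer is down.
            if up:
                received.append(event)
        else:
            # The broker always accepts the event...
            broker.append(event)
            # ...and the indexer drains the backlog whenever it is up.
            if up:
                while broker:
                    received.append(broker.popleft())
    return received


events = list(range(6))
indexer_up = [True, False, False, True, True, True]

print(deliver(events, indexer_up))           # [0, 3, 4, 5]
print(deliver(events, indexer_up, deque()))  # [0, 1, 2, 3, 4, 5]
```

Events 1 and 2 arrive while the indexer is down. Without the queue they are gone; with it they are delivered once the indexer returns, provided the queue has room for the backlog.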

Redis is a RAM-intensive app, and the amount of memory you give it will dictate how long a logstash outage you can withstand. On a 32GB server (shared with logstash), how much of the memory would you give to redis? How large is your average document? How many documents would it take to fill the memory? How long does it take to generate that many documents? In my experience, redis fails horribly when the memory fills, but that could just have been me.
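Those questions reduce to simple arithmetic. A rough sketch in Python follows; the 16GB Redis budget and the 500-byte average document are assumptions picked for illustration (only the 100GB/day figure comes from the question), and Redis's own per-entry overhead is ignored, so real numbers will be somewhat worse.

```python
GB = 10**9
SECONDS_PER_DAY = 86_400


def hours_until_full(redis_budget_bytes, daily_ingest_bytes):
    """Hours of indexer outage the buffer can absorb at a steady ingest rate."""
    bytes_per_second = daily_ingest_bytes / SECONDS_PER_DAY
    return redis_budget_bytes / bytes_per_second / 3600


def documents_to_fill(redis_budget_bytes, avg_document_bytes):
    """How many average-sized documents fit in the memory budget."""
    return redis_budget_bytes // avg_document_bytes


budget = 16 * GB  # assumed: half of the 32GB box goes to Redis

print(f"{hours_until_full(budget, 100 * GB):.2f} hours at 100GB/day")  # 3.84
print(f"{hours_until_full(budget, 150 * GB):.2f} hours at 150GB/day")  # 2.56
print(f"{documents_to_fill(budget, 500):,} documents of 500 bytes")    # 32,000,000
```

Under these assumptions the buffer covers only a few hours of outage, and less during traffic peaks, since the calculation assumes logs arrive evenly across the day.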

Logstash is a CPU-intensive process as all the filters get executed.

As for the size of the elasticsearch cluster, @magnus already pointed you to some information that might help. Starting with 64GB machines is great, and then scale horizontally as needed.
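For a first guess at disk needs, you can multiply out the numbers from the question. The sketch below is back-of-the-envelope only: one replica per shard and an on-disk size equal to the raw log size (`index_ratio=1.0`) are assumptions, and the real ratio depends on your mappings and should be measured on your test box.

```python
def cluster_storage_gb(daily_gb, retention_days, replicas=1, index_ratio=1.0):
    """Total disk needed across all data nodes, in GB.

    replicas:    replica copies per primary shard (assumed 1)
    index_ratio: indexed size divided by raw log size (assumed 1.0)
    """
    return daily_gb * retention_days * (1 + replicas) * index_ratio


print(f"{cluster_storage_gb(100, 60):.0f} GB")  # 12000 GB
print(f"{cluster_storage_gb(150, 60):.0f} GB")  # 18000 GB
```

So 60 days at 100GB/day with one replica is on the order of 12TB across the cluster, and about 18TB at 150GB/day, before leaving any free-space headroom.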

You should have two client (non-data) nodes that are used as the access point for inserts (efficiently dispatching the requests to the correct data node) and searches (handling the 'reduce' phase with data returned from the data nodes). Two of these in a failover config would be a good start.

Two kibana machines will give you redundancy. Putting them in a failover config is also good. nginx was used more with kibana3, I believe. I don't know if people are using it with kibana4 or have moved to 'shield'.

Hope that helps.

Alain Collins answered Sep 24 '22 19:09