
Kafka Memory requirement

I am a beginner with Kafka.

We are looking to size our Kafka cluster (a 5-node cluster) for processing 17,000 events/sec, with each event 600 bytes in size. We are planning a replication factor of 3 and a retention period of one week.

I read on the Kafka documentation page:

    assuming you want to be able to buffer for 30 seconds and compute your memory need as write_throughput*30.

So what is this write throughput? If it is the volume of data written per second, I am looking at 17,000 × 600 bytes ≈ 9960 KB/sec (roughly 9.7 MB/sec).

If I take that as my write throughput, the memory works out to about 292 MB (9960 KB/sec × 30).

So does that 292 MB represent the memory requirement for one node, or for the entire cluster (5 nodes)?

I would really appreciate some insights on the sizing of memory and disk.

Regards, VB

Suren Baskaran asked Jul 08 '15

People also ask

How much memory does Kafka require?

Kafka uses heap space very carefully and does not require heap sizes larger than 6 GB. On a 32 GB machine, this leaves a filesystem cache of up to 28-30 GB. You need sufficient memory to buffer active readers and writers.

Is Kafka memory or CPU intensive?

Most Kafka deployments tend to be rather light on CPU requirements. As such, the exact processor setup matters less than the other resources. Note that if SSL is enabled, the CPU requirements can be significantly higher (the exact details depend on the CPU type and JVM implementation).

Does Kafka use memory?

Kafka does not rely on keeping data in RAM; it achieves low-latency message delivery through sequential I/O and the zero-copy principle. Sequential I/O: Kafka relies heavily on the filesystem for storing and caching messages.
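"Zero copy" here refers to the sendfile(2) system call, which lets the kernel move bytes from a file descriptor without staging them in user space (Kafka itself invokes it through Java's `FileChannel.transferTo`). A minimal Linux-specific sketch of the call itself, using temporary files in place of Kafka's log segments and consumer sockets:

```python
import os
import tempfile

# sendfile(2) copies bytes entirely inside the kernel; the data never
# passes through an application buffer. This is the zero-copy path a
# broker uses when serving log segments to consumers.
with tempfile.TemporaryFile() as src, tempfile.TemporaryFile() as dst:
    src.write(b"x" * 4096)   # stand-in for a 4 KB log segment
    src.flush()              # make the bytes visible through the fd

    # offset=0, count=4096: returns the number of bytes transferred
    sent = os.sendfile(dst.fileno(), src.fileno(), 0, 4096)
    print(sent)
```

File-to-file sendfile requires a reasonably recent Linux kernel; on other platforms the destination must be a socket.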

Does Kafka need SSD?

Should I use SSDs for my Kafka Brokers? Using SSDs instead of spinning disks has not been shown to provide a significant performance improvement for Kafka, for two main reasons: Kafka writes to disk are asynchronous.


1 Answer

If your message size is 600 bytes at 17k msg/s, then your throughput is ~10 MB/s [17,000 × 600 / (1024 × 1024)]. If you partition the topic across the 5 brokers with 3 replicas, each broker handles 10/5×3 = 6 MB/s, which shouldn't be a problem on any normal hardware. Buffering 30 s of that means roughly 180 MB of memory per broker.

If you actually meant a message size of 600 kB, then you'd need plenty of very fast storage to sustain ~6 GB/s per broker, and it would be better to increase the number of nodes in the cluster instead.
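The arithmetic above can be written out as a quick sizing sketch. The event rate, event size, broker count, replication factor, and one-week retention all come from the question; the 30-second buffer comes from the documentation quote:

```python
# Kafka sizing sketch using the numbers from the question.
events_per_sec = 17_000
event_size_bytes = 600
brokers = 5
replication = 3
buffer_seconds = 30          # from the documentation's rule of thumb
retention_seconds = 7 * 86_400  # one week

# Ingest throughput before replication: ~9.7 MB/s (rounds to ~10 MB/s)
write_mb_s = events_per_sec * event_size_bytes / (1024 * 1024)

# Each broker carries replication/brokers of the replicated write load
per_broker_mb_s = write_mb_s * replication / brokers

# Memory to buffer 30 s of writes on each broker
# (~175 MB; rounding throughput to 10 MB/s gives the answer's 180 MB)
buffer_mb = per_broker_mb_s * buffer_seconds

# Disk for one week of retention, replicated, across the cluster
cluster_disk_tb = write_mb_s * replication * retention_seconds / (1024 * 1024)
per_broker_disk_tb = cluster_disk_tb / brokers

print(f"write throughput:  {write_mb_s:.1f} MB/s")
print(f"per-broker load:   {per_broker_mb_s:.1f} MB/s")
print(f"per-broker buffer: {buffer_mb:.0f} MB")
print(f"disk for a week:   {cluster_disk_tb:.1f} TB cluster-wide, "
      f"{per_broker_disk_tb:.1f} TB per broker")
```

Note the disk figure (~17 TB replicated for a week) is usually the dominant cost at this event rate, not memory.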

Lundahl answered Sep 23 '22