We have a fairly strict network segmentation policy. I am deploying an app to a Cloud Foundry instance. The firewall rules have been set up so that the Kafka cluster can be reached from within the Cloud Foundry instance. I believe the firewall rules also allow access to the ZooKeeper instance, but I still need to confirm that.
My problem seems to be that I can produce messages to Kafka, but my consumer doesn't pick them up. It seems to hang while "polling".
Are there any hidden hosts or ports that I need to cover in my firewall rules, beyond the standard hosts and ports of the Kafka and ZooKeeper nodes?
By default, the Kafka server is started on port 9092. Kafka uses ZooKeeper, and hence a ZooKeeper server is also started on port 2181. If the current default ports don't suit you, you can change either by adding the following in your build.
Network. Kafka uses a binary protocol over TCP. The protocol defines all APIs as request response message pairs.
Kafka uses a binary TCP-based protocol that is optimized for efficiency and relies on a "message set" abstraction that naturally groups messages together to reduce the overhead of the network roundtrip.
If we have 3 Kafka brokers spread across 3 datacenters, then a partition with 3 replicas will never have multiple replicas in the same datacenter. With this configuration, datacenter outages are not significantly different from broker outages.
Kafka and ZooKeeper are different things. If you are running both on the same machine, you need to open the ports for both, of course.
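kafka default ports: 9092 (can be changed in server.properties)
zookeeper default ports: 2181 for client connections, 2888 for follower (other ZooKeeper nodes) connections, 3888 for leader election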
That's it.
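A quick way to confirm the firewall rules is to try a plain TCP connection to each port from inside the Cloud Foundry app container. The following is a minimal sketch using only the Python standard library. The function names and the hostnames in the example comment are placeholders of mine, not something from your setup. Note that it only proves the port is reachable, not that Kafka or ZooKeeper is healthy behind it.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts (firewall drops) and DNS failures.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to whether it accepted a TCP connection."""
    return {
        (host, port): port_is_open(host, port, timeout)
        for host, port in endpoints
    }


# Example (hostnames are placeholders, replace them with your own nodes):
# check_endpoints([("kafka-1.example.internal", 9092),
#                  ("zk-1.example.internal", 2181)])
```

A port that times out rather than being refused usually points at a firewall silently dropping packets, which is why the timeout is kept short.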
Kafka also has the listeners and advertised.listeners properties, which cause some confusion for first-time users. Put simply, listeners is the network interface your server binds to, and advertised.listeners is the hostname or IP under which your server registers itself in ZooKeeper and accepts requests. If you put a hostname in there, your clients WILL have to use that hostname to connect. The advertised.listeners URL is the one your clients will use to bootstrap the connection. Once the connection is made, your client will connect to ZooKeeper to get the URLs of the other brokers. Your consumer is not working because of that.
So, to make it work, you need to open 2888 on your firewall too, not just 2181. And @Jaya Ananthram is wrong when he tells you that Kafka needs port 2181. It's a ZooKeeper port. Consumers on Kafka 0.10 still need to contact ZooKeeper to persist some things, that's all.
Kafka 0.11.0.0 changed this, so clients no longer need ZooKeeper at all.
TL;DR: There's no hidden port. Check your broker configuration. Make sure that it advertises an IP and port that are reachable by the Kafka consumers.
I came across this question after experiencing the same problem with Kafka 0.10.1.1, using the kafka-python library as the consumer.
No. I captured the network traffic, and the consumer doesn't use any other port to communicate with Kafka. If the brokers are configured to use 9092, that will be the only port used by consumers.
But upon further investigation, the broker configuration was at fault in my case.
kafka.advertised.listeners = PLAINTEXT://[private_ip]:9092,SSL://[public_ip]:9093
kafka.listeners = PLAINTEXT://0.0.0.0:9092,SSL://0.0.0.0:9093
I used [public_ip]:9092 as the bootstrap server because I did not have PKI set up, but I wanted to test my consumer from the public internet.
The consumer was able to connect to the broker but wasn't able to pull any messages.
Since the consumer connected to Kafka using PLAINTEXT, Kafka advertised the PLAINTEXT broker addresses instead of the SSL addresses. The consumer then tried to reach the Kafka brokers using their private IP addresses instead of the public ones (as revealed by a raw network capture).
After PKI was enabled and configured on the brokers and clients, I was able to pull messages from the public internet just fine.
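The address selection described above can be illustrated with a small sketch. It is a simplified model of the behaviour I observed, not Kafka's actual implementation: the broker hands back the advertised address of the listener the client connected on. The function names are hypothetical, and the IP addresses in the example comment are placeholders standing in for [private_ip] and [public_ip].

```python
def parse_listeners(value):
    """Parse 'NAME://host:port,NAME://host:port' into {NAME: (host, port)}."""
    endpoints = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, sep, address = entry.partition("://")
        if not sep:
            raise ValueError(f"missing '://' in listener entry: {entry!r}")
        # Split on the last colon so the host part is kept whole.
        host, _, port = address.rpartition(":")
        endpoints[name.strip().upper()] = (host, int(port))
    return endpoints


def endpoint_for_client(advertised_listeners, listener_name):
    """Return the advertised (host, port) for the listener a client used.

    Simplified model: a client that bootstrapped over a given listener is
    told to use that same listener's advertised address from then on.
    """
    return parse_listeners(advertised_listeners)[listener_name.upper()]


# Example with placeholder addresses:
# advertised = "PLAINTEXT://10.0.0.5:9092,SSL://203.0.113.7:9093"
# endpoint_for_client(advertised, "PLAINTEXT") gives the private address,
# which a consumer on the public internet cannot reach.
```

Combined with a plain TCP reachability test against the returned address, this makes it easy to see whether the consumer is being redirected to an endpoint that is blocked or not routable from where it runs.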