I'm trying to set up a basic Java consumer to receive messages from a Kafka topic. I've followed the sample at https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example and have this code:
package org.example.kafka.client;

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import kafka.consumer.ConsumerConfig;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class KafkaClientMain
{
    private final ConsumerConnector consumer;
    private final String topic;
    private ExecutorService executor;

    public KafkaClientMain(String a_zookeeper, String a_groupId, String a_topic)
    {
        this.consumer = kafka.consumer.Consumer.createJavaConsumerConnector(
                createConsumerConfig(a_zookeeper, a_groupId));
        this.topic = a_topic;
    }

    private static ConsumerConfig createConsumerConfig(String a_zookeeper, String a_groupId) {
        Properties props = new Properties();
        props.put("zookeeper.connect", a_zookeeper);
        props.put("group.id", a_groupId);
        props.put("zookeeper.session.timeout.ms", "1000");
        props.put("zookeeper.sync.time.ms", "1000");
        props.put("auto.commit.interval.ms", "1000");
        props.put("auto.offset.reset", "smallest");
        return new ConsumerConfig(props);
    }

    public void shutdown() {
        if (consumer != null) consumer.shutdown();
        if (executor != null) executor.shutdown();
    }

    public void run(int a_numThreads) {
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(topic, new Integer(a_numThreads));
        Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCountMap);
        List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);
        System.out.println( "streams.size = " + streams.size() );

        // now launch all the threads
        //
        executor = Executors.newFixedThreadPool(a_numThreads);

        // now create an object to consume the messages
        //
        int threadNumber = 0;
        for (final KafkaStream stream : streams) {
            executor.submit(new ConsumerTest(stream, threadNumber));
            threadNumber++;
        }
    }

    public static void main(String[] args)
    {
        String zooKeeper = "ec2-whatever.compute-1.amazonaws.com:2181";
        String groupId = "group1";
        String topic = "test";
        int threads = 1;
        KafkaClientMain example = new KafkaClientMain(zooKeeper, groupId, topic);
        example.run(threads);
    }
}
and
package org.example.kafka.client;

import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;

public class ConsumerTest implements Runnable
{
    private KafkaStream m_stream;
    private int m_threadNumber;

    public ConsumerTest(KafkaStream a_stream, int a_threadNumber)
    {
        m_threadNumber = a_threadNumber;
        m_stream = a_stream;
    }

    public void run()
    {
        System.out.println( "calling ConsumerTest.run()" );
        ConsumerIterator<byte[], byte[]> it = m_stream.iterator();
        while (it.hasNext())
        {
            System.out.println("Thread " + m_threadNumber + ": " + new String(it.next().message()));
        }
        System.out.println("Shutting down Thread: " + m_threadNumber);
    }
}
Kafka is running on the EC2 host in question, and I can send and receive messages on the topic "test" using the kafka-console-producer.sh and kafka-console-consumer.sh tools. Port 2181 is open and available from the machine where the consumer is running (and so is 9092 for good measure, but that didn't seem to help either).
Unfortunately, my consumer never receives any messages when I run this: neither messages already on the topic nor new ones that I send with kafka-console-producer.sh while the consumer is running.
This is using Kafka 0.8.1.1 running on CentOS 6.4 x64, using OpenJDK 1.7.0_65.
Edit: FWIW, when the consumer program starts, I see this Zookeeper output:
[2014-08-01 15:56:38,045] INFO Accepted socket connection from /98.101.159.194:24218 (org.apache.zookeeper.server.NIOServerCnxn)
[2014-08-01 15:56:38,049] INFO Client attempting to establish new session at /98.101.159.194:24218 (org.apache.zookeeper.server.NIOServerCnxn)
[2014-08-01 15:56:38,053] INFO Established session 0x1478e963fb30008 with negotiated timeout 6000 for client /98.101.159.194:24218 (org.apache.zookeeper.server.NIOServerCnxn)
Any idea what might be going on with this? Any and all help is much appreciated.
Answering this myself for posterity, in case anybody else runs across a similar problem.
The issue was this: The Kafka broker and Zookeeper were on an EC2 node, and the consumer was on my laptop running locally. When connecting to Zookeeper, the client was getting handed a reference to "ip-10-0-x-x.ec2.internal", which does not resolve (by default) from outside of EC2. This became clear once I properly configured log4j on the client so I was getting all of the log messages.
The workaround was to put an entry in my /etc/hosts file, mapping the EC2-internal hostname to the publicly routable IP address.
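If you suspect the same problem, a quick check is whether the hostname the broker registered in Zookeeper resolves from the machine running the consumer. Here is a minimal sketch using only the JDK; the class name HostResolutionCheck and its resolves helper are made up for illustration, and you would pass in whatever hostname shows up in your own client logs.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostResolutionCheck {

    // Returns true if the given hostname (or IP literal) resolves from this machine.
    public static boolean resolves(String host) {
        try {
            InetAddress.getByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the hostname to check as the first argument; defaults to localhost.
        String host = args.length > 0 ? args[0] : "localhost";
        if (resolves(host)) {
            System.out.println(host + " resolves from this machine");
        } else {
            System.out.println(host + " does NOT resolve from this machine");
        }
    }
}
```

Run it on the consumer machine with the broker's hostname as the argument. If it reports that the name does not resolve, the consumer cannot reach the broker under that name, even though the Zookeeper connection itself succeeds.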
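You can also solve this problem by setting the following property in the server.properties file, located under the Kafka config folder, so that the broker advertises a hostname that clients outside EC2 can resolve:
advertised.host.name=<public DNS name of the EC2 server>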