Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Kafka producer to read data files

I am trying to load a data file in loop(to check stats) instead of standard input in Kafka. After downloading Kafka, I performed the following steps:

Started zookeeper:

bin/zookeeper-server-start.sh config/zookeeper.properties

Started Server:

bin/kafka-server-start.sh config/server.properties

Created a topic named "test":

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

Ran the Producer:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test 
Test1
Test2

Listened by the Consumer:

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
Test1
Test2

Instead of Standard input, I want to pass a data file to the Producer which can be seen directly by the Consumer. Or is there any kafka producer instead of console consumer using which I can read data files. Any help would really be appreciated. Thanks!

like image 507
Sumit Avatar asked Feb 13 '16 08:02

Sumit


2 Answers

You can read data file via cat and pipeline it to kafka-console-producer.sh.

cat ${datafile} | ${kafka_home}/bin/kafka-console-producer.sh --broker-list ${brokerlist} --topic test 
like image 96
Shawn Guo Avatar answered Nov 02 '22 23:11

Shawn Guo


If there is always a single file, you can just use tail command and then pipeline it to kafka console producer.

But if a new file will be created when some conditions met, you may need use apache.commons.io.monitor to monitor new file created, then repeat above.

like image 24
Gang Sun Avatar answered Nov 02 '22 23:11

Gang Sun