I am looking to hack together a kafka consumer in Python or R (preferably R). Using the kafka console consumer I can grep for a string and retrieve the relevant data but I am at a loss when it comes to parsing it suitably in R.
There are kafka clients available in other languages (for example: PHP, CPP) but one in R would be helpful from a data analytics point of view.
It would be great if the expert R developers on this forum could hint at/suggest resources that would allow me to make headway in this direction.
Apache Kafka : incubator.apache.org/kafka/
Kafka Consumer Client(s) : https://github.com/kafka-dev/kafka/tree/master/clients
Step 1: Start the zookeeper as well as the kafka server initially. Step2: Type the command: 'kafka-console-consumer' on the command line. This will help the user to read the data from the Kafka topic and output it to the standard outputs. Video Player is loading.
Kafka consumers act as end-users or applications that retrieve data from Kafka servers inside which Kafka producers publish real-time messages. For effectively fetching real-time messages, Kafka consumers have to subscribe to the respective topics present inside the Kafka servers.
Get the list of consumer groups for a topic. Use kafka-consumer-groups.sh to list all consumer groups. Note that the below command will list all the consumer groups for all topics managed by the cluster.
Run Kafka Consumer Console Kafka provides the utility kafka-console-consumer.sh which is located at ~/kafka-training/kafka/bin/kafka-console-producer.sh to receive messages from a topic on the command line. Create the file in ~/kafka-training/lab1/start-consumer-console.sh and run it.
[2015 Update] there is a library that allows you to connect to kafka - rkafka
http://cran.r-project.org/web/packages/rkafka/rkafka.pdf
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With