
Streaming messages to multiple topics

I have a single master topic and multiple predicates, each of which has an output topic associated with it. I want to send each record to ALL topics whose predicate resolves to true. I am using Luwak to test which predicates a record satisfies. With this library you evaluate a document against a list of predicates and it tells you which ones matched, so I only call it once per record to get the list of satisfied predicates.

I am trying to use Kafka Streams for this, but there doesn't seem to be an appropriate method on KStream (KStream#branch routes each record to at most one branch, the first whose predicate matches).

One possible approach is as follows:

Stream from master
Map the values into a format with the original content and the list of matching predicates
Stream to an intermediate with-matches topic

For each predicate/output topic
    Stream from intermediate with-matches topic
    Filter "does the list of matched predicates contain this predicate's ID"
    Map the values to just the original content
    Stream to corresponding output topic
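
To make the steps above concrete, here is a minimal plain-Java model of the "match once, then fan out by predicate ID" idea. It does not use the Kafka Streams or Luwak APIs: the `matcher` function is a hypothetical stand-in for the single Luwak call, `WithMatches` stands in for the value format of the intermediate topic, and in-memory lists stand in for the topics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Plain-Java model only; names here are illustrative, not a real library API.
class MatchOnceSketch {

    // A record plus the IDs of the predicates it satisfied
    // (the value format of the intermediate "with-matches" step).
    static final class WithMatches {
        final String content;
        final Set<String> matchedIds;

        WithMatches(String content, Set<String> matchedIds) {
            this.content = content;
            this.matchedIds = matchedIds;
        }
    }

    // Step 1: run the matcher exactly once per record.
    static List<WithMatches> annotate(List<String> records,
                                      Function<String, Set<String>> matcher) {
        List<WithMatches> annotated = new ArrayList<>();
        for (String record : records) {
            annotated.add(new WithMatches(record, matcher.apply(record)));
        }
        return annotated;
    }

    // Step 2: per predicate ID, keep the records whose match set contains
    // that ID and map them back to just the original content.
    static Map<String, List<String>> fanOut(List<WithMatches> annotated,
                                            List<String> predicateIds) {
        Map<String, List<String>> outputs = new LinkedHashMap<>();
        for (String id : predicateIds) {
            List<String> topic = new ArrayList<>();
            for (WithMatches w : annotated) {
                if (w.matchedIds.contains(id)) {
                    topic.add(w.content);
                }
            }
            outputs.put(id, topic);
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Hypothetical matcher standing in for the one Luwak evaluation.
        Function<String, Set<String>> matcher = r -> {
            Set<String> ids = new HashSet<>();
            if (r.contains("error")) ids.add("p1");
            if (r.length() > 5) ids.add("p2");
            return ids;
        };
        List<WithMatches> annotated =
                annotate(Arrays.asList("error x", "ok", "error"), matcher);
        System.out.println(fanOut(annotated, Arrays.asList("p1", "p2")));
    }
}
```

The point of the model is that the expensive matching happens once per record, while each per-topic filter is only a cheap set lookup.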

Such an intermediate topic seems "clunky" though. Any better suggestions?

I am using:

  • Kafka v0.10.1.1
  • Luwak v1.4.0
bm1729 asked Feb 22 '17 10:02



1 Answer

You can simply apply multiple filters in parallel to the same KStream instance:

KStream stream = ...

stream.filter(new MyPredicate1()).to("output-topic-1");
stream.filter(new MyPredicate2()).to("output-topic-2");
stream.filter(new MyPredicate3()).to("output-topic-3");
// ... as many as you need

Each record will be sent to each predicate once -- it's conceptually a broadcast to all filters, but records will not be physically replicated, so there is no memory overhead.
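
As a rough illustration of that broadcast behaviour, here is a plain-Java model (not the Kafka Streams API): every record is offered to every predicate, and each matching "topic" receives a reference to the same record object. The class name `FanOutSketch`, the example predicates, and the use of in-memory lists as topics are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Plain-Java model of applying several filters to the same stream.
class FanOutSketch {

    // Offers every record to every predicate and collects, per topic,
    // the records for which that topic's predicate returned true.
    static Map<String, List<String>> route(List<String> records,
                                           Map<String, Predicate<String>> predicatesByTopic) {
        Map<String, List<String>> outputs = new LinkedHashMap<>();
        for (String topic : predicatesByTopic.keySet()) {
            outputs.put(topic, new ArrayList<>());
        }
        for (String record : records) {
            for (Map.Entry<String, Predicate<String>> e : predicatesByTopic.entrySet()) {
                if (e.getValue().test(record)) {
                    // The same reference is added; the record is not copied.
                    outputs.get(e.getKey()).add(record);
                }
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        Map<String, Predicate<String>> predicates = new LinkedHashMap<>();
        predicates.put("output-topic-1", v -> v.contains("a"));
        predicates.put("output-topic-2", v -> v.length() > 3);
        predicates.put("output-topic-3", v -> v.startsWith("z"));

        // prints {output-topic-1=[apple], output-topic-2=[apple, kiwi], output-topic-3=[]}
        System.out.println(route(Arrays.asList("apple", "kiwi", "fig"), predicates));
    }
}
```

Note that in this model each predicate is evaluated independently per record, which is how the parallel filters behave conceptually; a record can land in zero, one, or several outputs.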

Matthias J. Sax answered Oct 06 '22 00:10