Performing an asynchronous transformation within a Kafka Stream

Assume I have two Kafka topics, A and B. I am trying to develop a system that pulls records from A, applies a transformation to each record, then publishes the transformed records to B. In this case, the transformation involves calling a REST endpoint over HTTP.

Being relatively new to Kafka, I was glad to see that the Kafka Streams project already solves this type of problem (consume-transform-publish). Unfortunately, I discovered that transformations in Kafka Streams are blocking operations. My instinct is to call HTTP endpoints in a non-blocking, asynchronous manner.

Does this mean that Kafka Streams will not work in this situation? Does this mean that I must revert back to calling the REST endpoint in a blocking manner? Is this even an acceptable pattern for Kafka Streams? Stream-based data processing is still relatively new to me, so I am not entirely familiar with its concurrency models.

asked Jun 11 '16 13:06

Adam Paynter

1 Answer

Update: after looking into this further, I am not sure that this is the right answer...

I am new to Kafka and Kafka Streams (hereafter referred to as "Kafka"), but having encountered and considered similar questions, here is my perspective:

Kafka has two salient features:

All parallelism is achieved through the partitioning of topics.
Within a partition of a topic, processing is strongly ordered, one record at a time.

Many really nice properties fall out of these features. Stream-based "transactions", I think, are among the coolest.

But whether these properties are actually "features" in the sense that you want them, of course, depends on the application. If you don't want strongly ordered processing with parallelism based on topic partitioning, then you might not want to be using Kafka for that application.

So, with regard to:

Does this mean that Kafka Streams will not work in this situation?

It will work, but increased parallelism is achieved through increased partitioning.
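To make that concrete, here is a minimal sketch in plain Java, with no Kafka dependency. It only models the idea and is not how Kafka Streams is implemented: the class `PartitionedProcessing`, its `process` method, and the use of `hashCode` to pick a partition are all assumptions made for illustration. Each partition gets one worker thread that calls a blocking transformation one record at a time, so order is preserved within a partition while separate partitions run in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

public class PartitionedProcessing {

    /**
     * Runs a (possibly slow, blocking) transform with one worker thread per
     * partition. Records routed to the same partition are transformed one at
     * a time, in input order; different partitions proceed in parallel.
     */
    public static Map<Integer, List<String>> process(
            List<String> records, int partitions, Function<String, String> transform)
            throws InterruptedException {
        List<ExecutorService> workers = new ArrayList<>();
        Map<Integer, List<String>> output = new ConcurrentHashMap<>();
        for (int p = 0; p < partitions; p++) {
            workers.add(Executors.newSingleThreadExecutor());
            output.put(p, new ArrayList<>()); // list p is only written by worker p
        }
        for (String record : records) {
            // Assumption: hashCode stands in for a real partitioner.
            int p = Math.floorMod(record.hashCode(), partitions);
            workers.get(p).execute(() -> output.get(p).add(transform.apply(record)));
        }
        for (ExecutorService worker : workers) {
            worker.shutdown(); // already-queued records are still processed
            worker.awaitTermination(1, TimeUnit.MINUTES);
        }
        return output;
    }
}
```

In this model, a slow transformation only stalls the partition it belongs to, so total throughput grows with the number of partitions rather than with any concurrency inside a single partition.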

Does this mean that I must revert back to calling the REST endpoint in a blocking manner?

Yes, I think it does, but I'm not sure why that would be a "reversion". Personally, that's what I like about Kafka: blocking code is simpler. If I want more parallelism, I can run more threads, up to the number of partitions. There's no shared state, after all.
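If the HTTP client you already have is asynchronous, calling it "in a blocking manner" can be as simple as waiting on the future it returns. The following is a rough sketch under that assumption; `BlockingTransform`, `transform`, and the `asyncCall` parameter are hypothetical names, and the asynchronous call is modelled as a function returning a `CompletableFuture` rather than any particular HTTP library.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

public class BlockingTransform {

    /**
     * Calls an asynchronous operation and blocks the calling thread until it
     * completes or the timeout expires. Failures and timeouts are rethrown as
     * unchecked exceptions so the method can be used as a plain function.
     */
    public static String transform(
            String value,
            Function<String, CompletableFuture<String>> asyncCall, // hypothetical async client call
            long timeoutMillis) {
        try {
            return asyncCall.apply(value).get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("Interrupted while transforming " + value, e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("Transformation failed for " + value, e);
        }
    }
}
```

A method like this could then be used wherever a per-record transformation is expected, for instance inside a `mapValues` step of a Kafka Streams topology. The timeout matters: without one, a single hung request would stall its partition indefinitely.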

answered Oct 03 '22 05:10

Dmitry Minkovsky

Tags: concurrency, apache-kafka, apache-kafka-streams
