Have a kafka cluster from which I consuming two topics and join it. With result of join I do some manipulation with database. All operations to DB is async, so they return me a Future (scala.concurrent.Future, but anyway its the same as java.util.concurrent.CompletableFuture). So as a result I got code like this:
val firstSource: KTable[String, Obj]
val secondSource: KTable[String, Obj2]
def enrich(data: ObjAndObj2): Future[EnrichedObj]
def saveResultToStorage(enrichedData: Future[EnrichedObj]): Future[Unit]
firstSource.leftJoin(secondSource, joinFunc)
.mapValues(enrich)
.foreach(saveResultToStorage)
Is it okay that I manupulate with future values in stream or there are better ways how to handle async tasks (like .mapAsync in Akka streams)?
As point 1 if having just a producer producing message we don't need Kafka Stream. If consumer messages from one Kafka cluster but publish to different Kafka cluster topics. In that case, you can even use Kafka Stream but have to use a separate Producer to publish messages to different clusters.
Introduction. Apache Kafka is the most popular open-source distributed and fault-tolerant stream processing system. Kafka Consumer provides the basic functionalities to handle messages. Kafka Streams also provides real-time stream processing on top of the Kafka Consumer client.
Apache Kafka is an open-source streaming platform that enables the development of applications that ingest a high volume of real-time data. It was originally built by the geniuses at LinkedIn and is now used at Netflix, Pinterest and Airbnb to name a few.
Kafka Streams is a client library for building applications and microservices, where the input and output data are stored in an Apache Kafka® cluster. It combines the simplicity of writing and deploying standard Java and Scala applications on the client side with the benefits of Kafka's server-side cluster technology.
I have this same issue. From what I can tell, Kafka Streams is not designed to handle multi-rate streaming the same way Akka Streams is. Kafka Streams has no equivalent of the multi-rate primitives Akka has like mapAsync, throttle, conflate, buffer, batch, etc. Kafka Streams is good at handling joins between topics and stateful aggregations of data. Akka Streams is good at multi-rate and asynchronous processing.
You have a couple options how to handle this:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With