Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Difference between stream processing and message processing

What is the basic difference between stream processing and traditional message processing? As people say that kafka is good choice for stream processing but essentially kafka is a messaging framework similar to ActivMQ, RabbitMQ etc.

Why do we generally not say that ActiveMQ is good for stream processing as well.

Is it the speed at which messages are consumed by the consumer determines if it is a stream?

like image 448
TechEnthusiast Avatar asked Jan 19 '17 14:01

TechEnthusiast


People also ask

What is the difference between message queues vs event streaming platforms?

Once messages are written to a stream, they stay there. Conversely, with message queues or message buses, a message is removed from a queue and then passed to a service's consumer.

What is the meaning of stream processing?

Stream processing is a data management technique that involves ingesting a continuous data stream to quickly analyze, filter, transform or enhance the data in real time. Once processed, the data is passed off to an application, data store or another stream processing engine.

What is the difference between real time and streaming?

Streaming data processing means that the data will be analyzed and that actions will be taken on the data within a short period of time or near real-time, as best as it can. Real-time data processing guarantees that the real-time data will be acted on within a period of time, like milliseconds.

What is stream processing in Kafka?

A stream processing application is any program that makes use of the Kafka Streams library. It defines its computational logic through one or more processor topologies, where a processor topology is a graph of stream processors (nodes) that are connected by streams (edges).


2 Answers

In traditional message processing, you apply simple computations on the messages -- in most cases individually per message.

In stream processing, you apply complex operations on multiple input streams and multiple records (ie, messages) at the same time (like aggregations and joins).

Furthermore, traditional messaging systems cannot go "back in time" -- ie, they automatically delete messages after they got delivered to all subscribed consumers. In contrast, Kafka keeps the messages as it uses a pull-based model (ie, consumers pull data out of Kafka) for a configurable amount of time. This allows consumers to "rewind" and consume messages multiple times -- or if you add a new consumer, it can read the complete history. This makes stream processing possible, because it allows for more complex applications. Furthermore, stream processing is not necessarily about real-time processing -- it's about processing infinite input streams (in contrast to batch processing, which is applied to finite inputs).

And Kafka offers Kafka Connect and Streams API -- so it is a stream-processing platform and not just a messaging/pub-sub system (even if it uses this in its core).

like image 114
Matthias J. Sax Avatar answered Sep 19 '22 15:09

Matthias J. Sax


If you like splitting hairs: Messaging is communication between two or more processes or components whereas streaming is the passing of event log as they occur. Messages carry raw data whereas events contain information about the occurrence of and activity such as an order. So Kafka does both, messaging and streaming. A topic in Kafka can be raw messages or and event log that is normally retained for hours or days. Events can further be aggregated to more complex events.

like image 38
ziad.rida Avatar answered Sep 19 '22 15:09

ziad.rida