I am very intrigued by Redis streams. (Looks like the potential to build little systems powered by append-logs, like Kafka, but without all the overhead of Kafka.)
It looks straightforward to XADD
to a log/stream and to consume an entry from a log/stream. But what about if you want to join across two streams?
Kafka Streams, Flink, Spark, etc. provide means for doing this. Is there an equivalent in the Redis universe?
If not, I guess I'll just need to implement my own thing that consumes from two streams, does its own join logic from the messages, and publishes back out to a new stream. If others have experience doing this with Redis Streams, please do share your pointers or warnings.
Conceptually, a Stream in Redis is a list where you can append entries. Each entry has a unique ID and a value. The ID is auto-generated by default, and it includes a timestamp. The value is a hash. You can query ranges or use blocking commands to read entries as they come.
Redis delivers more than a million read/write operations per second, with sub-millisecond latency on a modestly sized commodity cloud instance, making it extremely resource-efficient for large volumes of data.
A Redis stream is a data structure that acts like an append-only log. You can use streams to record and simultaneously syndicate events in real time. Examples of Redis stream use cases include: Event sourcing (e.g., tracking user actions, clicks, etc.)
The reason is that Redis streams support range queries by ID. Because the ID is related to the time the entry is generated, this gives the ability to query for time ranges basically for free.
If I am correct, you are looking for a way to join two Redis streams.
It seems there is a connector available for Spark, that allows you to consume the streams https://github.com/RedisLabs/spark-redis/blob/master/doc/streaming.md
From here Spark logic for the join should be easy to use.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With