Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Redis Pubsub and Message Queueing

My overall question is: Using Redis for PubSub, what happens to messages when publishers push messages into a channel faster than subscribers are able to read them?

For example, let's say I have:

  • A simple publisher publishing messages at the rate of 2 msg/sec.
  • A simple subscriber reading messages at the rate of 1 msg/sec.

My naive assumption would be the subscriber would only see 50% of the messages published onto Redis. To test this theory, I wrote two scripts:

pub.py

queue = redis.StrictRedis(host='localhost', port=6379, db=0) channel = queue.pubsub()  for i in range(10):      queue.publish("test", i)     time.sleep(0.5) 

sub.py

r = redis.StrictRedis(host='localhost', port=6379, db=0) p = r.pubsub() p.subscribe('test')  while True:     message = p.get_message()     if message:         print "Subscriber: %s" % message['data']     time.sleep(1) 

Results

  • When I ran sub.py first, immediately followed by pub.py, I found that sub.py actually displayed all the messages (1-10), one after another with a delay of 1 second in between. My initial assumption was wrong, Redis is queuing messages. More tests needed.
  • When I ran pub.py first, then waited 5 seconds before running sub.py, I found that sub.py only displayed the second half of the messages (5-10). I would have assumed this originally, but given my previous results, I would have thought messages were queued, which led me to the following conclusion...

Conclusions

  • Redis server appears to queue messages for each client, for each channel.
  • As long as a client is listening, it doesn't matter how fast it reads messages. As long as it's connected, messages will remain queued for that client, for that channel.

Remaining Questions

  • Are these conclusions valid?
  • If so, how long will client/channel messages remained queued?
  • If so, is there a redis-cli info command to see how many messages are queued (for each client/channel)?
like image 810
Marco Benvoglio Avatar asked Jan 02 '15 17:01

Marco Benvoglio


People also ask

Can Redis be used as a message queue?

Redis Lists and Redis Sorted Sets are the basis for implementing message queues. They can be used both directly to build bespoke solutions, or via a framework that makes message processing more idiomatic for your programming language of choice.

What is difference between Pubsub and message queue?

In a message-queue model, the publisher pushes messages to a queue where each subscriber can listen to a particular queue. In the event-stream model using Pub/Sub, the publisher pushes messages to a topic that multiple subscribers can listen to.

Is Redis good for Pubsub?

Aside from data storage, Redis can be used as a Publisher/Subscriber platform. In this pattern, publishers can issue messages to any number of subscribers on a channel. These messages are fire-and-forget, in that if a message is published and no subscribers exists, the message evaporates and cannot be recovered.

Is Redis cache or message queue?

Redis or REmote DIctionary Server is an advanced NoSQL key-value data store used as a cache, database, and message broker. It provides tools like Redis message queue for message broking. It is known for its rich data types, fast read and writes operations, and advanced memory structure.


1 Answers

The tests are valid, but the conclusions are partially wrong.

Redis does not queue anything on pub/sub channels. On the contrary, it tends to read the item from the publisher socket, and write the item in all the subscriber sockets, ideally in the same iteration of the event loop. Nothing is kept in Redis data structures.

Now, as you demonstrated, there is still some kind of buffering. It is due to the usage of TCP/IP sockets, and Redis communication buffers.

Sockets have buffers, and of course, TCP comes with some flow control mechanisms. It avoids the loss of data when buffers are full. If a subscriber is not fast enough, data will accumulate in its socket buffer. When it is full, TCP will block the communication and prevents Redis to push more information in the socket.

Redis also manages output communication buffers (on top of the ones of the sockets) to generate data formatted with the Redis protocol. So when the output buffer of the socket is full, the event loop will mark the socket as non writable, and data will remain in Redis output buffers.

Provided the TCP connection is still valid, data can remain in the buffers for a very long time. Now, both the socket and Redis output buffer are bound. If the subscribers are really too slow, and a lot of data accumulate, Redis will ultimately close the connection with subscribers (as a safety mechanism).

By default, for pub/sub, Redis has a soft limit at 8 MB, and a hard limit at 32 MB, per connection buffer. If the output buffer reaches the hard limit, or if it remains between the soft and hard limit for more than 60 seconds, the connection with the slow subscriber will be closed.

Knowing the number of pending messages is not easy. It can be evaluated by looking at the size of the pending information in the socket buffers, and the Redis output buffers.

For Redis output buffers, you can use the CLIENT LIST command (from redis-cli). The size of the output buffer is returned in the obl and oll fields (in bytes).

For socket buffers, there is no Redis command. However, on Linux, it is possible to build a script to interpret the content of the /proc/net/tcp file. See an example here. This script probably needs to be adapted to your system.

like image 172
Didier Spezia Avatar answered Sep 19 '22 19:09

Didier Spezia