Redis Pubsub and Message Queueing

Tags: python, redis, redis-cli

My overall question is: Using Redis for PubSub, what happens to messages when publishers push messages into a channel faster than subscribers are able to read them?

For example, let's say I have:

A simple publisher publishing messages at the rate of 2 msg/sec.
A simple subscriber reading messages at the rate of 1 msg/sec.

My naive assumption would be the subscriber would only see 50% of the messages published onto Redis. To test this theory, I wrote two scripts:

pub.py

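    import time
    import redis

    queue = redis.StrictRedis(host='localhost', port=6379, db=0)
    channel = queue.pubsub()  # not used by the publisher

    # Publish ten messages, one every 0.5 s (2 msg/sec).
    for i in range(10):
        queue.publish("test", i)
        time.sleep(0.5)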

sub.py

    import time
    import redis

    r = redis.StrictRedis(host='localhost', port=6379, db=0)
    p = r.pubsub()
    p.subscribe('test')

    # Poll for at most one message per second (1 msg/sec).
    while True:
        message = p.get_message()
        if message:
            print("Subscriber: %s" % message['data'])
        time.sleep(1)

Results

When I ran sub.py first, immediately followed by pub.py, I found that sub.py actually displayed all the messages (1-10), one after another with a delay of 1 second in between. My initial assumption was wrong: Redis appears to be queuing messages. More tests needed.
When I ran pub.py first, then waited 5 seconds before running sub.py, I found that sub.py only displayed the second half of the messages (5-10). I would have assumed this originally, but given my previous results, I would have thought messages were queued, which led me to the following conclusion...

Conclusions

Redis server appears to queue messages for each client, for each channel.
As long as a client is listening, it doesn't matter how fast it reads messages. As long as it's connected, messages will remain queued for that client, for that channel.

Remaining Questions

Are these conclusions valid?
If so, how long will client/channel messages remain queued?
If so, is there a redis-cli info command to see how many messages are queued (for each client/channel)?

810

asked Jan 02 '15 17:01

Marco Benvoglio

1 Answer

The tests are valid, but the conclusions are partially wrong.

Redis does not queue anything on pub/sub channels. On the contrary, it tends to read the item from the publisher socket and write it to all the subscriber sockets, ideally in the same iteration of the event loop. Nothing is kept in Redis data structures.

Now, as you demonstrated, there is still some kind of buffering. It is due to the usage of TCP/IP sockets, and Redis communication buffers.

Sockets have buffers, and of course, TCP comes with flow control mechanisms that avoid losing data when buffers are full. If a subscriber is not fast enough, data will accumulate in its socket buffer. When that buffer is full, TCP will block the communication and prevent Redis from pushing more data into the socket.
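The socket-level buffering described above can be seen without Redis at all. The following is a minimal sketch, assuming a local socket pair is an acceptable stand-in for the TCP connection between Redis and a subscriber; the function name demo_socket_buffering is made up for illustration. The sender writes every message before the receiver reads anything, and nothing is lost.

```python
import socket


def demo_socket_buffering(n_messages=10):
    """Write n small messages before the receiver reads anything.

    Uses a local socket pair as a stand-in for the connection between
    Redis and a subscriber (an assumption made for this sketch).
    """
    sender, receiver = socket.socketpair()
    try:
        # The "publisher" side writes everything up front. The data sits
        # in the kernel socket buffers because nobody has read it yet.
        for i in range(n_messages):
            sender.sendall(("msg %d\n" % i).encode("ascii"))
        sender.shutdown(socket.SHUT_WR)  # signal end of stream

        # The "subscriber" side only starts reading now.
        received = b""
        while True:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            received += chunk
        return received.decode("ascii").splitlines()
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(demo_socket_buffering(10))
```

The messages here are tiny, so they fit comfortably in the socket buffer. With enough unread data the sender's write would block (or fail with "would block" on a non-blocking socket), which is the point at which Redis falls back to its own output buffers.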

Redis also manages output communication buffers (on top of the socket buffers) to hold data formatted with the Redis protocol. When the output buffer of the socket is full, the event loop marks the socket as non-writable, and data remains in the Redis output buffers.

Provided the TCP connection is still valid, data can remain in the buffers for a very long time. However, both the socket buffers and the Redis output buffers are bounded. If a subscriber is really too slow and a lot of data accumulates, Redis will ultimately close the connection with that subscriber (as a safety mechanism).

By default, for pub/sub, Redis has a soft limit at 8 MB, and a hard limit at 32 MB, per connection buffer. If the output buffer reaches the hard limit, or if it remains between the soft and hard limit for more than 60 seconds, the connection with the slow subscriber will be closed.
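To get a feel for these limits, here is a rough back-of-the-envelope sketch. It assumes the backlog in the Redis output buffer grows at a constant rate and ignores whatever the socket buffers absorb first; the function name and its parameters are invented for illustration, and the real server may not enforce the limits at exactly these instants.

```python
def seconds_until_disconnect(growth_bytes_per_sec,
                             soft_limit=8 * 1024 ** 2,
                             hard_limit=32 * 1024 ** 2,
                             soft_seconds=60):
    """Estimate when a slow subscriber would be disconnected.

    Toy model only: the output buffer grows at a constant rate, and the
    connection is closed either when the hard limit is reached or after
    the buffer has stayed above the soft limit for soft_seconds.
    Returns None if the backlog never grows.
    """
    if growth_bytes_per_sec <= 0:
        return None  # the subscriber keeps up, nothing accumulates
    time_to_soft = soft_limit / float(growth_bytes_per_sec)
    time_to_hard = hard_limit / float(growth_bytes_per_sec)
    # Whichever condition triggers first closes the connection.
    return min(time_to_hard, time_to_soft + soft_seconds)


if __name__ == "__main__":
    # Backlog growing by 1 MiB/s: the hard limit is hit first.
    print(seconds_until_disconnect(1024 ** 2))
    # Backlog growing by 100 KiB/s: the soft limit timer fires first.
    print(seconds_until_disconnect(100 * 1024))
```

In the experiment from the question the backlog grows by roughly one tiny message per second, so neither limit would be reached for a very long time.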

Knowing the number of pending messages is not easy. It can be evaluated by looking at the size of the pending information in the socket buffers, and the Redis output buffers.

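For Redis output buffers, you can use the CLIENT LIST command (from redis-cli). The obl field gives the length of the fixed output buffer in bytes, oll gives the number of reply objects queued in the output list (used once the fixed buffer is full), and omem gives the total memory used by the output buffers in bytes.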
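As a sketch of how that output could be inspected from a script, the following parses the raw text returned by CLIENT LIST and keeps the obl, oll and omem fields of clients that have at least one subscription. The sample lines are made up for illustration (real output contains more fields, and the set varies by Redis version), and the function names are hypothetical.

```python
def parse_client_list(raw):
    """Turn raw CLIENT LIST text into a list of {field: value} dicts."""
    clients = []
    for line in raw.splitlines():
        if not line.strip():
            continue
        fields = {}
        for pair in line.split():
            key, sep, value = pair.partition("=")
            if sep:  # skip anything that is not key=value
                fields[key] = value
        clients.append(fields)
    return clients


def pubsub_output_buffers(raw):
    """Return output buffer figures for clients with active subscriptions."""
    result = []
    for client in parse_client_list(raw):
        subscribed = int(client.get("sub", 0)) + int(client.get("psub", 0))
        if subscribed > 0:
            result.append({
                "addr": client.get("addr"),
                "obl": int(client.get("obl", 0)),
                "oll": int(client.get("oll", 0)),
                "omem": int(client.get("omem", 0)),
            })
    return result


# Made-up sample, shaped like CLIENT LIST output (not captured from a server).
SAMPLE = (
    "id=7 addr=127.0.0.1:52344 fd=8 name= age=12 idle=0 flags=P db=0 "
    "sub=1 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=3 omem=61512 "
    "events=rw cmd=subscribe\n"
    "id=8 addr=127.0.0.1:52350 fd=9 name= age=2 idle=0 flags=N db=0 "
    "sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 "
    "events=r cmd=client\n"
)

if __name__ == "__main__":
    print(pubsub_output_buffers(SAMPLE))
```

If you already use redis-py, its client_list() method returns the same information as a list of dictionaries, so the manual parsing step should not be needed there.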

For socket buffers, there is no Redis command. However, on Linux, it is possible to build a script to interpret the content of the /proc/net/tcp file. See an example here. This script probably needs to be adapted to your system.
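For illustration only, here is a minimal sketch of such a script (not the one the answer links to). It assumes the usual Linux layout of /proc/net/tcp, where the second and third columns are hexadecimal address:port pairs and the fifth column is tx_queue:rx_queue in hexadecimal. The sample lines and function names are invented.

```python
def parse_tcp_line(line):
    """Extract ports and queue sizes from one /proc/net/tcp data line."""
    parts = line.split()
    tx_hex, rx_hex = parts[4].split(":")
    return {
        "local_port": int(parts[1].split(":")[1], 16),
        "remote_port": int(parts[2].split(":")[1], 16),
        "tx_queue": int(tx_hex, 16),  # bytes queued for sending
        "rx_queue": int(rx_hex, 16),  # bytes received but not yet read
    }


def queues_for_port(lines, port=6379):
    """Return the parsed rows whose local or remote port matches."""
    rows = []
    for line in lines:
        parts = line.split()
        if not parts or parts[0] == "sl":  # skip blank lines and the header
            continue
        row = parse_tcp_line(line)
        if port in (row["local_port"], row["remote_port"]):
            rows.append(row)
    return rows


# Made-up sample in the shape of /proc/net/tcp (0x18EB == 6379).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:18EB 0100007F:CC78 01 00001F40:00000000 00:00000000 00000000  1000        0 12345
   1: 0100007F:CC78 0100007F:18EB 01 00000000:00000400 00:00000000 00000000  1000        0 12346
   2: 0100007F:0016 0100007F:D431 01 00000000:00000000 00:00000000 00000000     0        0 12347
"""

if __name__ == "__main__":
    for row in queues_for_port(SAMPLE.splitlines()):
        print(row)
```

On a real system the same function can be fed the file directly, for example queues_for_port(open("/proc/net/tcp")). Rows with local port 6379 are the server side of a connection (tx_queue is what Redis has handed to the kernel but not yet delivered), while rows with remote port 6379 are the client side (rx_queue is what the subscriber has not yet read). IPv6 connections are listed in /proc/net/tcp6, which this sketch does not handle.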

172

answered Sep 19 '22 19:09

Didier Spezia
