
Connection reset on large MGET requests

Tags: python, redis

When making large MGET requests to Redis (>2,000,000 arguments) using redis-py, I get the following socket error:

ConnectionError: Error 104 while writing to socket. Connection reset by peer.

I've tried this from different clients, but the issue remains. I read here that there was possibly a window scaling bug going on, so I tried adjusting net.ipv4.tcp_wmem and net.ipv4.tcp_rmem to have a smaller maximum window, but this didn't work either. I'm running this on Python 2.7.3, Ubuntu 12.04.1 LTS, and Redis 2.6.4.

asked Mar 23 '23 by thyme


1 Answer

You cannot retrieve that many values with a single MGET. The command is not designed to sustain such a workload. Generating very large Redis commands is a bad idea:

  • On the server side, the whole command must fit in the input buffer, and the whole reply must fit in the output buffer. The input buffer is limited to 1 GB. For the output buffer there are soft and hard limits depending on the kind of client. Growing the buffers close to these limits is asking for trouble: Redis simply closes the connection when they are exceeded, which is exactly the "connection reset" you are seeing.

  • On the client side, there are likely similar buffers and hard-coded limits.

  • Redis is a single-threaded event loop, so command execution is serialized. A very large command makes Redis unresponsive to all other clients while it runs.

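To get a feel for why a 2,000,000-argument MGET strains those buffers, here is a rough, back-of-the-envelope estimate of the RESP wire size of the request alone (the key length of 16 bytes is an assumption for illustration; your keys may differ):

```python
# Rough size, in bytes, of the RESP encoding of an MGET request.
# RESP sends an array header "*<argc>\r\n", then each argument as
# "$<len>\r\n<bytes>\r\n". The verb MGET itself counts as one argument.
def resp_mget_size(n_keys, key_len=16):
    size = len("*%d\r\n" % (n_keys + 1))      # array header
    size += len("$4\r\nMGET\r\n")             # the MGET verb
    size += n_keys * (len("$%d\r\n" % key_len) + key_len + 2)  # each key
    return size

print(resp_mget_size(2000000))  # tens of megabytes for the request alone
```

And that is only the request: the reply carries every value as well, so with non-trivial values the output buffer fills far faster than the input buffer.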
Should you want to retrieve a massive amount of data, you are supposed to pipeline several GET or MGET commands. For example, the following code can be used to retrieve an arbitrary number of items while minimizing the number of roundtrips and server-side CPU consumption:

import redis

N_PIPE = 50  # number of MGET commands per pipeline execution
N_MGET = 20  # number of keys per MGET command

# Return a dictionary mapping the input keys to their values
def massive_get(r, array):
    res = {}
    pipe = r.pipeline(transaction=False)
    i = 0
    while i < len(array):
        keys = []
        for n in range(N_PIPE):
            k = array[i:i + N_MGET]
            keys.append(k)
            pipe.mget(k)
            i += N_MGET
            if i >= len(array):
                break
        # execute() sends the batch and resets the pipeline for the next one
        for k, v in zip(keys, pipe.execute()):
            res.update(dict(zip(k, v)))
    return res

# Example: retrieve all keys from 0 to 1022
pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
r = redis.Redis(connection_pool=pool)
array = list(range(1023))
print(massive_get(r, array))
answered Apr 10 '23 by Didier Spezia