It's not clear to me how connection pools work, or how to use them properly. I was hoping someone could elaborate. I've sketched out my use case below:
settings.py:

    import redis

    def get_redis_connection():
        return redis.StrictRedis(host='localhost', port=6379, db=0)

task1.py:

    import settings

    connection = settings.get_redis_connection()

    def do_something1():
        return connection.hgetall(...)

task2.py:

    import settings

    connection = settings.get_redis_connection()

    def do_something1():
        return connection.hgetall(...)

etc.
Basically, I have a settings.py file that returns redis connections, and several different task files that get a redis connection and then run operations. So each task file has its own redis instance (which presumably is very expensive). What's the best way of optimizing this process? Is it possible to use connection pools for this example? Is there a more efficient way of setting up this pattern?
For our system, we have over a dozen task files following this same pattern, and I've noticed our requests slowing down.
Thanks
Connection pooling is great for scalability: if you have 100 threads/clients/end-users, each of which needs to talk to the database, you don't want every one of them to hold a dedicated connection open (connections are expensive resources). Instead, they share a smaller set of connections via the pool.
Redis-py provides a connection pool for you from which you can retrieve a connection. Connection pools create a set of connections which you can use as needed (and when done - the connection is returned to the connection pool for further reuse). Trying to create connections on the fly without discarding them (i.e. not using a pool or not using the pool correctly) will leave you with way too many connections to redis (until you hit the connection limit).
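To make the borrow-and-return cycle concrete, here is a minimal sketch of a toy pool built only on the standard library. It is not how redis-py implements its pool; `FakeConnection`, `ToyPool` and `run_workers` are hypothetical names made up for this illustration, and the "connection" is just an object with an id. The point is only that 100 workers can be served by 5 connections, because each worker hands its connection back when it is done.

```python
import queue
import threading
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for an expensive network connection."""

    def __init__(self, conn_id):
        self.conn_id = conn_id


class ToyPool:
    """Fixed-size pool: callers borrow a connection and hand it back."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for conn_id in range(size):
            self._idle.put(FakeConnection(conn_id))

    @contextmanager
    def connection(self):
        conn = self._idle.get()      # blocks until a connection is free
        try:
            yield conn
        finally:
            self._idle.put(conn)     # return it so another caller can reuse it


def run_workers(pool, n_workers):
    """Run n_workers threads that each borrow one connection; return the ids used."""
    used = []
    used_lock = threading.Lock()

    def worker():
        with pool.connection() as conn:
            with used_lock:
                used.append(conn.conn_id)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return used


pool = ToyPool(size=5)
used_ids = run_workers(pool, n_workers=100)
print(len(used_ids), "requests served by", len(set(used_ids)), "connections")
```

If the workers skipped the "return" step, the sixth worker would block forever here; with a client that opens a new connection instead of blocking, the same mistake shows up as an ever-growing connection count on the server.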
You could choose to set up the connection pool in an init function and make the pool global (you can look at other options if you are uncomfortable with a global).

    import os
    import redis

    redis_pool = None

    def init():
        global redis_pool
        print("PID %d: initializing redis pool..." % os.getpid())
        redis_pool = redis.ConnectionPool(host='10.0.0.1', port=6379, db=0)

You can then retrieve a connection from the pool like this:

    redis_conn = redis.Redis(connection_pool=redis_pool)

Also, I am assuming you are using hiredis along with redis-py, as it should improve performance in certain cases. Have you also checked the number of connections open to the redis server with your existing setup? It is most likely quite high. You can use the INFO command to get that information:

    redis-cli info

Check the Clients section, in which you will see the "connected_clients" field. It tells you how many connections are open to the redis server at that instant.
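If you want to track that number from a script rather than by eye, the INFO text is easy to parse: section headers start with `#` and every other line is `key:value`. The following is a rough sketch under that assumption; `parse_info` is a hypothetical helper written for this answer, and the sample text is made up for illustration, not output from a real server. (redis-py also exposes an `info()` method on the client that returns this data as a dict.)

```python
def parse_info(text):
    """Parse INFO-style text into {section: {key: value}} (values stay strings)."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Section header, e.g. "# Clients"
            current = line.lstrip("#").strip()
            sections[current] = {}
        elif ":" in line and current is not None:
            key, _, value = line.partition(":")
            sections[current][key] = value
    return sections


# Illustrative sample only; real values come from `redis-cli info`.
SAMPLE_INFO = """\
# Clients
connected_clients:42
blocked_clients:0
"""

info = parse_info(SAMPLE_INFO)
connected = int(info["Clients"]["connected_clients"])
print("connected_clients =", connected)
```

Logging this value before and after switching to a shared pool is a simple way to confirm that the change actually reduced the number of open connections.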
You can use a singleton-style (Borg pattern) wrapper written over redis-py, which provides a common connection pool to all your files. Whenever you use an object of this wrapper class, it will use the same connection pool.
    # settings.py
    REDIS_SERVER_CONF = {
        'servers': {
            'main_server': {
                'HOST': 'X.X.X.X',
                'PORT': 6379,
                'DATABASE': 0,
            }
        }
    }

    # wrapper module
    import redis
    import settings

    class RedisWrapper(object):
        # Borg pattern: every instance shares this dict as its __dict__.
        shared_state = {}

        def __init__(self):
            self.__dict__ = self.shared_state
            # Cache one pool per server key so each pool is created only once.
            self.__dict__.setdefault('pools', {})

        def redis_connect(self, server_key):
            if server_key not in self.pools:
                conf = settings.REDIS_SERVER_CONF['servers'][server_key]
                self.pools[server_key] = redis.ConnectionPool(
                    host=conf['HOST'], port=conf['PORT'], db=conf['DATABASE'])
            return redis.StrictRedis(connection_pool=self.pools[server_key])

Usage:

    r_server = RedisWrapper().redis_connect(server_key='main_server')
    r_server.ping()

UPDATE
If your files run as separate processes, they cannot share a single in-process pool. In that case you will need a redis proxy that pools the connections for you, and you connect to the proxy instead of connecting to redis directly. A very stable redis (and memcached) proxy is twemproxy, created by Twitter, whose main purpose is reducing the number of open connections.