
PyMongo raises [errno 49] can't assign requested address after a large number of queries

I have a MongoDB collection with > 1,000,000 documents. I am performing an initial .find({ my_query }) to return a subset of those documents (~25,000 documents), which I then put into a list object.

I am then looping over each of the objects, parsing some values from the returned document in the list, and performing an additional query using those parsed values via the code:

def _perform_queries(query):
    conn = pymongo.MongoClient('mongodb://localhost:27017')
    try:
        coll = conn.databases['race_results']
        races = coll.find(query).sort("date", -1)
    except Exception as err:
        print('An error occurred in runner query: %s\n' % err)
    finally:
        conn.close()
        return races

In this case, my query dictionary is:

{"$and": [{"opponents":
    {"$elemMatch": {"$and": [
        {"runner.name": name},
        {"runner.jockey": jockey}
    ]}}},
    {"summary.dist": "1"}
]}

Here is my issue. I have created an index on opponents.runner.name and opponents.runner.jockey. This makes the queries really, really fast. However, after about 10,000 queries in a row, pymongo raises an exception:
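For reference, an index like the one described can be created with something along these lines (a sketch: the compound key, database/collection names, and the helper name are assumptions based on the query and code above — the question does not show how the index was actually built):

```python
# Compound index over the two fields used inside $elemMatch.
# Field names are taken from the query in the question; 1 = ascending.
INDEX_KEYS = [
    ("opponents.runner.name", 1),
    ("opponents.runner.jockey", 1),
]

def ensure_index(uri="mongodb://localhost:27017"):
    """Create the index (requires pymongo and a running mongod)."""
    import pymongo
    conn = pymongo.MongoClient(uri)
    coll = conn.databases["race_results"]  # names as in the question's code
    return coll.create_index(INDEX_KEYS)
```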

pymongo.errors.AutoReconnect: [Errno 49] Can't assign requested address

When I remove the index, I don't see this error. But it takes about 0.5 seconds per query, which is unusable in my case.

Does anyone know why the [Errno 49] can't assign requested address could be occurring? I've seen a few other SO questions related to can't assign requested address, but not in relation to pymongo, and their answers don't lead me anywhere.

UPDATE:

Following Serge's advice below, here is the output of ulimit -a:

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 2560
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited

My MongoDB is running on OS X Yosemite.

asked Mar 23 '15 by Brett

1 Answer

This is because you are using PyMongo incorrectly. You are creating a new MongoClient for each query, which requires you to open a new socket for each new query. This defeats PyMongo's connection pooling, and besides being extremely slow, it also means you open and close sockets faster than your TCP stack can keep up: you leave too many sockets in TIME_WAIT state, so you eventually run out of ephemeral ports.
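A back-of-the-envelope calculation shows why the error appears after roughly 10,000 fast queries. The numbers below are assumptions, not from the question: a closed socket lingers in TIME_WAIT for about 30 seconds on OS X, and the default ephemeral port range is 49152-65535:

```python
# Why rapid connect/close cycles exhaust ephemeral ports.
# Assumed defaults: ~30 s TIME_WAIT linger, IANA ephemeral range 49152-65535.
TIME_WAIT_SECONDS = 30
EPHEMERAL_PORTS = 65535 - 49152 + 1        # 16384 ports available

# Highest sustained rate of (connect, query, close) cycles before every
# ephemeral port is parked in TIME_WAIT and bind() fails with Errno 49:
max_rate = EPHEMERAL_PORTS / TIME_WAIT_SECONDS
print(max_rate)                            # roughly 546 connections/sec
```

With the index in place each query finishes in a few milliseconds, so the loop easily exceeds that rate; without the index, each query takes ~0.5 s and the ports are recycled in time, which is why removing the index "fixed" the error.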

Luckily, the fix is simple. Create one MongoClient and use it throughout:

conn = pymongo.MongoClient('mongodb://localhost:27017')
coll = conn.databases['race_results']

def _perform_queries(query):
    return coll.find(query).sort("date", -1)
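The difference between the two patterns can be seen without a live mongod by using a stand-in class (hypothetical, purely illustrative) that counts how many clients, and therefore sockets, each approach opens:

```python
class FakeClient:
    """Stand-in for MongoClient that counts instantiations ('sockets')."""
    opened = 0
    def __init__(self):
        FakeClient.opened += 1

# Pattern from the question: a new client per query.
for _ in range(10_000):
    FakeClient()                  # 10,000 sockets churned through TIME_WAIT
per_query = FakeClient.opened

# Pattern from the answer: one shared client, reused for every query.
FakeClient.opened = 0
shared = FakeClient()
for _ in range(10_000):
    pass                          # each query reuses shared's pooled socket
print(per_query, FakeClient.opened)
```

One process, one MongoClient: PyMongo's pool then hands each query an already-open socket instead of forcing the OS to allocate a fresh port.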
answered Nov 10 '22 by A. Jesse Jiryu Davis