On a Django project that communicates with various web services, we have an issue where requests occasionally take around 5 seconds instead of their usual < 100 ms.
I've narrowed this down to time spent in the socket.getaddrinfo function. This is called by requests when we connect to external services, but it also appears to affect the default Django connection to the Postgres database box in the cluster. When we restart uwsgi after a deployment, the first requests that come in take 5 seconds to send a response. I also believe that our celery tasks are regularly taking 5 seconds, but I haven't added statsd timer tracking to them yet.
I've written some code to reproduce the issue:
    import socket
    import timeit

    def single_dns_lookup():
        start = timeit.default_timer()
        socket.getaddrinfo('stackoverflow.com', 443)
        end = timeit.default_timer()
        return int(end - start)  # truncated to whole seconds

    timings = {}

    for _ in range(10000):
        time = single_dns_lookup()
        try:
            timings[time] += 1
        except KeyError:
            timings[time] = 1

    print timings
Typical results are {0: 9921, 5: 79}
My colleague has already pointed to potential issues around IPv6 lookup times and has added this to /etc/gai.conf:

    precedence ::ffff:0:0/96 100
This has definitely improved lookups from non-Python programs such as curl, which we use, but not from Python itself. The server boxes run Ubuntu 16.04.3 LTS, and I'm able to reproduce this on a vanilla VM with Python 2.
What steps can I take to improve the performance of all Python lookups so that they can take < 1s?
5 s is the default DNS lookup timeout; you can lower it. Your real problem, though, is probably (silent) UDP packet drops on the network.

Edit: experiment with resolution over TCP. I've never done that myself, but it might help you.
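On glibc-based systems such as Ubuntu, both suggestions above can be tried through resolver options in /etc/resolv.conf: `timeout:n` lowers the per-query timeout, `attempts:n` limits retries, and `use-vc` (glibc ≥ 2.14) forces resolution over TCP. A sketch, with illustrative values:

```
# /etc/resolv.conf
options timeout:1 attempts:2 use-vc
```

Note that on Ubuntu 16.04 this file is typically managed by resolvconf, so the change may need to go into its configuration to survive a reboot.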
There are two things that can be done. One is to avoid querying the IPv6 address at all, which can be done by monkey-patching socket.getaddrinfo:
    import socket

    orig_getaddrinfo = socket.getaddrinfo

    def _getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
        # Always resolve as IPv4, ignoring the requested family
        return orig_getaddrinfo(host, port, socket.AF_INET, type, proto, flags)

    socket.getaddrinfo = _getaddrinfo
Next, you can also use a TTL-based cache to store the result. The cachepy package can be used for this.
    import socket
    import timeit

    from cachepy import *  # or: from cachepy import Cache
    # from cachetools import cached  # alternative: @cached(cache={})

    cache_with_ttl = Cache(ttl=600)  # TTL given in seconds

    orig_getaddrinfo = socket.getaddrinfo

    @cache_with_ttl
    def _getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
        # Resolve IPv4 only, and cache the result for the TTL
        return orig_getaddrinfo(host, port, socket.AF_INET, type, proto, flags)

    socket.getaddrinfo = _getaddrinfo

    def single_dns_lookup():
        start = timeit.default_timer()
        socket.getaddrinfo('stackoverflow.com', 443)
        end = timeit.default_timer()
        return int(end - start)

    timings = {}

    for _ in range(10000):
        time = single_dns_lookup()
        try:
            timings[time] += 1
        except KeyError:
            timings[time] = 1

    print(timings)
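If adding a dependency like cachepy is undesirable, the same TTL-caching idea can be sketched with only the standard library (the 600-second TTL is an illustrative choice, not something prescribed by the question):

```python
import socket
import time

orig_getaddrinfo = socket.getaddrinfo

_cache = {}  # maps lookup args -> (expiry timestamp, cached result)
TTL = 600    # seconds to keep a cached lookup (illustrative value)

def _getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
    key = (host, port, family, type, proto, flags)
    now = time.time()
    hit = _cache.get(key)
    if hit is not None and hit[0] > now:
        return hit[1]  # still fresh: serve from cache
    # Miss or expired: resolve IPv4 only and cache the result
    result = orig_getaddrinfo(host, port, socket.AF_INET, type, proto, flags)
    _cache[key] = (now + TTL, result)
    return result

socket.getaddrinfo = _getaddrinfo
```

The trade-off versus cachepy is that this never evicts stale entries until they are looked up again, which is fine for a process that queries a small, stable set of hostnames.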