
Intermittent issues while accessing external http services using gevent

First off, the versions:

  • gevent - v0.13.7
  • gunicorn - v0.14.2
  • requests - 0.11.2

We recently switched our servers, which run behind gunicorn, from the normal sync workers to the gevent asynchronous workers. Everything works great, except that we now see intermittent failures when accessing a 3rd-party service over HTTP, and I have no idea how to track down the cause.

A brief stack trace looks like the following:

File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/requests/sessions.py", line 295, in post
  return self.request('post', url, data=data, **kwargs)
File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/requests/sessions.py", line 252, in request
  r.send(prefetch=prefetch)
File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/requests/models.py", line 625, in send
  raise ConnectionError(sockerr)
ConnectionError: [Errno 66] unknown

Here is a different stack trace, which we think stems from the same issue:

File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 94, in connect
  sock = socket.create_connection((self.host, self.port), self.timeout)
File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/gevent/socket.py", line 637, in create_connection
  for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/gevent/socket.py", line 769, in getaddrinfo
  raise
DNSError: [Errno 66] unknown

At first, I thought it could be related to libevent-dns, based on this google groups issue. I checked our /etc/resolv.conf, and only one DNS nameserver is configured:

[me@host:~]$ cat /etc/resolv.conf
; generated by /sbin/dhclient-script
nameserver 10.3.0.2

I looked up what errno 66 is: https://github.com/libevent/libevent/blob/master/include/event2/dns.h#L162 says "/** An unknown error occurred */". That isn't very helpful. It sounds like it couldn't talk to the DNS server?
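
One way to check whether the resolver itself fails intermittently is to call it in a loop and count the errors. The following is only a minimal sketch: `probe_dns` and its parameters are hypothetical names, not part of gevent or requests, and it assumes the resolver's errors derive from `socket.error` (which I believe is the case for gevent's `DNSError`, but verify that on your version).

```python
import socket
import time


def probe_dns(host, port=80, attempts=20, delay=0.0,
              resolver=socket.getaddrinfo):
    """Resolve `host` repeatedly and collect any resolver errors.

    Returns a list of (attempt_number, error) tuples; an empty list
    means every lookup succeeded.
    """
    failures = []
    for attempt in range(1, attempts + 1):
        try:
            # Same call shape as the create_connection frame in the traceback.
            resolver(host, port, 0, socket.SOCK_STREAM)
        except socket.error as exc:
            # Assumption: the resolver raises a socket.error subclass.
            failures.append((attempt, exc))
        if delay:
            time.sleep(delay)
    return failures
```

The default `resolver` is bound when the function is defined, so to exercise gevent's resolver, pass it explicitly, e.g. `probe_dns("api.example.com", resolver=gevent.socket.getaddrinfo)` (the host name here is a placeholder). Comparing the failure count against a run with the stdlib resolver should show whether the problem is specific to gevent's DNS path.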

I thought it might have something to do with python-requests (see "how enable requests async mode?"), since python-requests depends on urllib3, which is implemented in terms of httplib. However, it turns out the author of gevent removed the httplib patch in this commit earlier this year, without any comment as to why.

Does anyone have any ideas on how to approach debugging this issue or might shed some light on what's happening here?

Thanks in advance!

Update - 12:50PM PDT

After some conversations on freenode, the #gevent and #gunicorn channels shed some more light:

#gevent

  • gevent v0.13.7 still supports the patch_all with httplib=True
  • I asked whether it makes sense to patch it; the response was no.
  • Recommendation: use gevent 1.0 (even though it's in beta).
  • quote from @schmir:

    "patch httplib uses libevent http client library. I don't trust libevent. my advice would have been to turn it off, if you used it"

#gunicorn

  • <Damianz> What's your platform? I've seen that issue appear on windows boxes where it tries ipv6 and just fails life.. (I'm on CentOS 5)
  • <dmishe> I've seen similar on mac, looks like gevent beta fixed it

Sounds like the general advice is to ditch gevent v0.13.7 and upgrade to gevent 1.0b.
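
If you want the application to fail fast when it is started against an old gevent, a startup check along these lines could work. This is a rough sketch based only on the advice in this thread: `is_pre_1_0` is a hypothetical helper, and treating every pre-1.0 release as suspect is an assumption, not something gevent documents.

```python
import re


def major_minor(version):
    """Extract (major, minor) from a version string such as '0.13.7' or '1.0b3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % (version,))
    return int(match.group(1)), int(match.group(2))


def is_pre_1_0(version):
    """Return True for releases older than the 1.0 series (betas count as 1.0)."""
    return major_minor(version) < (1, 0)
```

At startup this would be called as `is_pre_1_0(gevent.__version__)`, logging a warning or aborting when it returns True.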

I'll follow up on whether that fixes the issue. Meanwhile, I'd much appreciate any advice.

Update #2 - 4 days in production, 1:15PM PDT

Looks like the upgrade to gevent has solved this issue -- I'll add my answer and accept it if no one else chimes in, but only after a week without incidents in production.

Mahmoud Abdelkader asked Sep 13 '12 18:09


1 Answer

Upgrading to gevent 1.0b has eliminated the issue.

Mahmoud Abdelkader answered Nov 16 '22 02:11