Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

104, 'Connection reset by peer' socket error, or When does closing a socket result in a RST rather than FIN?

We're developing a Python web service and a client web site in parallel. When we make an HTTP request from the client to the service, one call consistently raises a socket.error in socket.py, in read:

(104, 'Connection reset by peer')

When I listen in with wireshark, the "good" and "bad" responses look very similar:

  • Because of the size of the OAuth header, the request is split into two packets. The service responds to both with ACK
  • The service sends the response, one packet per header (HTTP/1.0 200 OK, then the Date header, etc.). The client responds to each with ACK.
  • (Good request) the server sends a FIN, ACK. The client responds with a FIN, ACK. The server responds ACK.
  • (Bad request) the server sends a RST, ACK, the client doesn't send a TCP response, the socket.error is raised on the client side.

Both the web service and the client are running on a Gentoo Linux x86-64 box running glibc-2.6.1. We're using Python 2.5.2 inside the same virtual_env.

The client is a Django 1.0.2 app that is calling httplib2 0.4.0 to make requests. We're signing requests with the OAuth signing algorithm, with the OAuth token always set to an empty string.

The service is running Werkzeug 0.3.1, which is using Python's wsgiref.simple_server. I ran the WSGI app through wsgiref.validator with no issues.

It seems like this should be easy to debug, but when I trace through a good request on the service side, it looks just like the bad request, in the socket._socketobject.close() function, turning delegate methods into dummy methods. When the send or sendto (can't remember which) method is switched off, the FIN or RST is sent, and the client starts processing.

"Connection reset by peer" seems to place blame on the service, but I don't trust httplib2 either. Can the client be at fault?

** Further debugging - Looks like server on Linux **

I have a MacBook, so I tried running the service on one and the client website on the other. The Linux client calls the OS X server without the bug (FIN ACK). The OS X client calls the Linux service with the bug (RST ACK, and a (54, 'Connection reset by peer')). So, it looks like it's the service running on Linux. Is it x86_64? A bad glibc? wsgiref? Still looking...

** Further testing - wsgiref looks flaky **

We've gone to production with Apache and mod_wsgi, and the connection resets have gone away. See my answer below, but my advice is to log the connection reset and retry. This will let your server run OK in development mode, and solidly in production.

like image 322
jwhitlock Avatar asked Dec 20 '08 21:12

jwhitlock


People also ask

What is error 104 reset by peer?

It means that TCP reset has been sent to your computer. This happens for example when web server is restarted due to configuration change. Usually you solve this problem by reloading web page or waiting few minutes for maintanance window to close.

What causes Connection reset by peer errors?

The error message "Connection reset by peer" appears, if the web services client was waiting for a SOAP response from the remote web services provider and the connection was closed prematurely. One of the most common causes for this error is a firewall in the middle closing the connection.

What does Connection reset by peer mean Python?

An application gets a connection reset by peer error when it has an established TCP connection with a peer across the network, and that peer unexpectedly closes the connection on the far end.


1 Answers

I've had this problem. See The Python "Connection Reset By Peer" Problem.

You have (most likely) run afoul of small timing issues based on the Python Global Interpreter Lock.

You can (sometimes) correct this with a time.sleep(0.01) placed strategically.

"Where?" you ask. Beats me. The idea is to provide some better thread concurrency in and around the client requests. Try putting it just before you make the request so that the GIL is reset and the Python interpreter can clear out any pending threads.

like image 187
S.Lott Avatar answered Oct 19 '22 16:10

S.Lott