Boto S3 throws httplib.IncompleteRead occasionally

I have several daemons that read many files from Amazon S3 using boto. Once every couple of days, an httplib.IncompleteRead is raised from deep inside boto. If I retry the request, it immediately fails with another IncompleteRead. Even if I call bucket.connection.close(), all further requests still error out.

I feel like I might've stumbled across a bug in boto here, but nobody else seems to have hit it. Am I doing something wrong? All of the daemons are single-threaded, and I've tried setting is_secure both ways.
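For reference, each daemon's read loop is roughly this (simplified sketch; the bucket name and the processing step are placeholders, not my actual code):

import boto

conn = boto.connect_s3(is_secure=True)  # I've also tried is_secure=False
bucket = conn.get_bucket('my-bucket')   # placeholder bucket name

for key in bucket.list():
    data = key.read()   # the IncompleteRead surfaces from inside this call
    process(data)       # placeholder for the daemon's real work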

Traceback (most recent call last):
  ...
  File "<file_wrapper.py",> line 22, in next
    line = self.readline()
  File "<file_wrapper.py",> line 37, in readline
    data = self.fh.read(self.buffer_size)
  File "<virtualenv/lib/python2.6/site-packages/boto/s3/key.py",> line 378, in read
    self.close()
  File "<virtualenv/lib/python2.6/site-packages/boto/s3/key.py",> line 349, in close
    self.resp.read()
  File "<virtualenv/lib/python2.6/site-packages/boto/connection.py",> line 411, in read
    self._cached_response = httplib.HTTPResponse.read(self)
  File "/usr/lib/python2.6/httplib.py", line 529, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python2.6/httplib.py", line 621, in _safe_read
    raise IncompleteRead(''.join(s), amt)

Environment:

  • Amazon EC2
  • Ubuntu 11.10
  • Python 2.6.7
  • Boto 2.12.0
asked Oct 15 '13 by wolak

2 Answers

I've been struggling with this problem for a while, running long-lived processes which read large amounts of data from S3. I decided to post my solution here, for posterity.

First of all, I'm sure the hack pointed to by @Glenn works, but I chose not to use it because I consider it intrusive (it monkey-patches httplib) and unsafe (it blindly returns whatever it got, i.e. return e.partial, even when the incomplete read reflects a real error).

Here is the solution I finally came up with, which seems to be working.

I'm using this general-purpose retrying function:

import time, logging, httplib, socket

def run_with_retries(func, num_retries, sleep=None, exception_types=Exception, on_retry=None):
    for i in range(num_retries):
        try:
            return func()  # call the function
        except exception_types as e:
            # failed on one of the known exception types
            if i == num_retries - 1:
                raise  # this was the last attempt; re-raise
            logging.warning('operation %s failed with error %s. will retry %d more times',
                            func, e, num_retries - i - 1)
            if on_retry is not None:
                on_retry()
            if sleep is not None:
                time.sleep(sleep)
    assert 0  # should not reach this point

Now, when reading a file from S3, I'm using this function, which internally performs retries in case of IncompleteRead errors. Upon an error, before retrying, I call key.close().

def read_s3_file(key):
    """
    Reads the entire contents of a file on S3.
    @param key: a boto.s3.key.Key instance
    """
    return run_with_retries(
        key.read, num_retries=3, sleep=0.5,
        exception_types=(httplib.IncompleteRead, socket.error),
        # close the connection before retrying
        on_retry=lambda: key.close()
    )
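For completeness, a typical call site looks roughly like this (the bucket and key names are placeholders, not from my actual setup):

import boto

conn = boto.connect_s3()
bucket = conn.get_bucket('my-bucket')         # placeholder bucket name
key = bucket.get_key('path/to/object.json')   # placeholder key name
contents = read_s3_file(key)                  # retries on IncompleteRead / socket.error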
answered by shx2

It may well be a bug in boto, but the symptoms you describe are not unique to it. See:

  • IncompleteRead using httplib
  • https://dev.twitter.com/discussions/9554

Since httplib appears in your traceback, one solution is proposed here:

http://bobrochel.blogspot.in/2010/11/bad-servers-chunked-encoding-and.html?showComment=1358777800048
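The gist of the workaround described there is to monkey-patch httplib.HTTPResponse.read so that an IncompleteRead returns whatever partial data did arrive instead of raising. Roughly, as a sketch of that approach (not something I have tested with boto):

import httplib

def patch_http_response_read(func):
    def inner(*args):
        try:
            return func(*args)
        except httplib.IncompleteRead as e:
            # swallow the error and return the partial body that was received
            return e.partial
    return inner

httplib.HTTPResponse.read = patch_http_response_read(httplib.HTTPResponse.read)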

Disclaimer: I have no experience with boto. This is based on research only and posted since there have been no other responses.

answered by Glenn