I've implemented a retry mechanism for a requests session using urllib3.util.retry, as suggested both here and here. Now I am trying to figure out the best way to add a callback function that will be called on every retry attempt.
To explain myself a bit more: if either the Retry object or the requests get method had a way to register a callback function, that would be great. Maybe something like:
import requests
from requests.packages.urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter

def retry_callback(url):
    print(url)

s = requests.Session()
retries = Retry(total=5, status_forcelist=[500, 502, 503, 504])
s.mount('http://', HTTPAdapter(max_retries=retries))

url = 'http://httpstat.us/500'
s.get(url, callback=retry_callback, callback_params=[url])
I know that for printing the url I could use logging, but this is only a simple example of a more complex use case.
You can subclass the Retry class to add that functionality.
This is the full interaction flow with the Retry instance for a given connection attempt:
- Retry.increment() is called with the current method, url, response object (if there is one), and exception (if one was raised) whenever an exception is raised, a 30x redirection response was returned, or the Retry.is_retry() method returns true.
- .increment() re-raises the error (if there was one) if the object was configured not to retry that specific class of errors.
- .increment() calls Retry.new() to create an updated instance, with any relevant counters updated and the history attribute amended with a new RequestHistory() instance (a named tuple).
- .increment() raises a MaxRetryError exception if Retry.is_exhausted(), called on the return value of Retry.new(), is true. is_exhausted() returns true when any of the counters it tracks has dropped below 0 (counters set to None are ignored).
- .increment() returns the new Retry instance.
- The new Retry instance replaces the old one that was tracked. If there was a redirect, Retry.sleep_for_retry() is called (sleeping if there was a Retry-After header); otherwise Retry.sleep() is called (which calls self.sleep_for_retry() to honour a Retry-After header, or just sleeps if there is a back-off policy). Then a recursive connection call is made with the new Retry instance.
This gives you 3 good callback points: at the start of .increment(), when creating the new Retry instance, and in a context manager around super().increment() to let a callback veto an exception or update the returned retry policy on exit.
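As a sketch of that third option, you could wrap `super().increment()` in a try/except and give a hook one last look at the retry state before the exception propagates. The class name `InspectingRetry` and the `on_exhausted` callback are made up for illustration; this is not a urllib3 API:

```python
import logging

from urllib3.exceptions import MaxRetryError
from urllib3.util.retry import Retry

logger = logging.getLogger(__name__)


class InspectingRetry(Retry):
    """Wrap increment() so a hook can observe exhausted retries.

    `on_exhausted` is a hypothetical callback, invoked just before a
    MaxRetryError propagates to the caller.
    """

    def __init__(self, *args, **kwargs):
        self._on_exhausted = kwargs.pop('on_exhausted', None)
        super(InspectingRetry, self).__init__(*args, **kwargs)

    def new(self, **kw):
        # carry the callback over to each new Retry instance
        kw['on_exhausted'] = self._on_exhausted
        return super(InspectingRetry, self).new(**kw)

    def increment(self, *args, **kwargs):
        try:
            return super(InspectingRetry, self).increment(*args, **kwargs)
        except MaxRetryError:
            if self._on_exhausted is not None:
                self._on_exhausted(self)  # last chance to log or clean up
            raise
```

A true "veto" (swallowing the exception and returning a replacement Retry instance) is also possible in that except block, but then the connection pool will keep retrying with whatever policy you return, so use it carefully.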
This is what putting a hook on the start of .increment()
would look like:
import logging

import requests
from requests.packages.urllib3.util.retry import Retry
from requests.adapters import HTTPAdapter

logger = logging.getLogger(__name__)

class CallbackRetry(Retry):
    def __init__(self, *args, **kwargs):
        self._callback = kwargs.pop('callback', None)
        super(CallbackRetry, self).__init__(*args, **kwargs)

    def new(self, **kw):
        # pass along the subclass additional information when creating
        # a new instance.
        kw['callback'] = self._callback
        return super(CallbackRetry, self).new(**kw)

    def increment(self, method, url, *args, **kwargs):
        if self._callback:
            try:
                self._callback(url)
            except Exception:
                logger.exception('Callback raised an exception, ignoring')
        return super(CallbackRetry, self).increment(method, url, *args, **kwargs)
Note that the url argument is really only the URL path; the network location portion of the request is omitted (you'd have to extract that from the _pool argument, which has .scheme, .host and .port attributes).
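If the callback does need the full URL, one approach (a sketch; `FullURLCallbackRetry` is a made-up name) is to rebuild it from the `_pool` keyword argument before invoking the callback, while still passing the original path through to the parent class:

```python
import logging

from urllib3.util.retry import Retry

logger = logging.getLogger(__name__)


class FullURLCallbackRetry(Retry):
    def __init__(self, *args, **kwargs):
        self._callback = kwargs.pop('callback', None)
        super(FullURLCallbackRetry, self).__init__(*args, **kwargs)

    def new(self, **kw):
        kw['callback'] = self._callback
        return super(FullURLCallbackRetry, self).new(**kw)

    def increment(self, method, url, *args, **kwargs):
        full_url = url
        pool = kwargs.get('_pool')
        if pool is not None:
            # rebuild scheme://host:port/path from the connection pool;
            # note the port is always explicit in the result
            full_url = '%s://%s:%s%s' % (pool.scheme, pool.host, pool.port, url)
        if self._callback:
            try:
                self._callback(full_url)
            except Exception:
                logger.exception('Callback raised an exception, ignoring')
        # pass the original path through unchanged
        return super(FullURLCallbackRetry, self).increment(method, url, *args, **kwargs)
```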
Demo:
>>> def retry_callback(url):
... print('Callback invoked with', url)
...
>>> s = requests.Session()
>>> retries = CallbackRetry(total=5, status_forcelist=[500, 502, 503, 504], callback=retry_callback)
>>> s.mount('http://', HTTPAdapter(max_retries=retries))
>>> s.get('http://httpstat.us/500')
Callback invoked with /500
Callback invoked with /500
Callback invoked with /500
Callback invoked with /500
Callback invoked with /500
Callback invoked with /500
Traceback (most recent call last):
File "/.../lib/python3.6/site-packages/requests/adapters.py", line 440, in send
timeout=timeout
File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 732, in urlopen
body_pos=body_pos, **response_kw)
File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 732, in urlopen
body_pos=body_pos, **response_kw)
File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 732, in urlopen
body_pos=body_pos, **response_kw)
[Previous line repeated 1 more times]
File "/.../lib/python3.6/site-packages/urllib3/connectionpool.py", line 712, in urlopen
retries = retries.increment(method, url, response=response, _pool=self)
File "<stdin>", line 8, in increment
File "/.../lib/python3.6/site-packages/urllib3/util/retry.py", line 388, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='httpstat.us', port=80): Max retries exceeded with url: /500 (Caused by ResponseError('too many 500 error responses',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/.../lib/python3.6/site-packages/requests/sessions.py", line 521, in get
return self.request('GET', url, **kwargs)
File "/.../lib/python3.6/site-packages/requests/sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "/.../lib/python3.6/site-packages/requests/sessions.py", line 618, in send
r = adapter.send(request, **kwargs)
File "/.../lib/python3.6/site-packages/requests/adapters.py", line 499, in send
raise RetryError(e, request=request)
requests.exceptions.RetryError: HTTPConnectionPool(host='httpstat.us', port=80): Max retries exceeded with url: /500 (Caused by ResponseError('too many 500 error responses',))
Putting a hook in the .new() method would let you adjust the policy for the next attempt, as well as introspect the .history attribute, but it would not let you avoid the exception re-raising.
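For example, a .new() hook could inspect the accumulated history and adapt the policy for subsequent attempts. A minimal sketch, where the class name `AdaptiveRetry` and the "double the back-off after three failures" rule are made up for illustration:

```python
from urllib3.util.retry import Retry


class AdaptiveRetry(Retry):
    def new(self, **kw):
        # increment() passes the amended history in here; once three
        # attempts have failed, double the back-off factor for the
        # retries that follow
        history = kw.get('history', self.history)
        if history is not None and len(history) >= 3:
            kw.setdefault('backoff_factor', self.backoff_factor * 2)
        return super(AdaptiveRetry, self).new(**kw)
```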