I want to "ping" a server, check the header response to see if the link is broken, and if it's not broken, actually download the response body.
Traditionally, using a sync method with the requests module, you could send a get request with the stream = True parameter and capture the headers before the response body downloads, deciding, in case of an error (not found, for example), to abort the connection.
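For reference, the synchronous pattern described above might look like the sketch below. The helper name fetch_if_ok and the "status below 400" rule are assumptions made for illustration; session is assumed to be a requests.Session or anything with a compatible get().

```python
def fetch_if_ok(session, url):
    """Return the body bytes if the status is below 400, otherwise None.

    `session` is assumed to be a requests.Session; fetch_if_ok is a
    hypothetical helper name, not part of requests.
    """
    # With stream=True, get() returns once the headers have arrived;
    # the body is only read when .content (or iter_content) is used.
    response = session.get(url, stream=True)
    try:
        if response.status_code >= 400:
            return None  # broken link: the body is never read
        return response.content  # this access triggers the download
    finally:
        # Release the connection whether or not the body was read.
        response.close()
```

Called as fetch_if_ok(requests.Session(), url), this should skip the body download for 404-style responses.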
My problem is that doing this with the async libraries grequests or requests-futures has proven impossible with my limited knowledge.
I've tried setting the stream parameter to True in requests-futures, but to no avail: it still downloads the response body without letting me intervene as soon as it gets the response headers. And even if it did, I wouldn't be sure how to proceed.
This is what I've tried:
from requests_futures.sessions import FuturesSession
session = FuturesSession()
session.stream = True
future = session.get('http://www.google.com')
response = future.result()
print(response.status_code) # Here I would assume the response body hasn't been loaded
Upon debugging I find it downloads the response body either way.
I would appreciate any solution to the initial problem, whether it follows my logic or not.
I believe what you want is an HTTP HEAD request:
session.head('http://www.google.com')
Per w3.org, "the HEAD method is identical to GET except that the server MUST NOT return a message-body in the response." If you like the status code and headers, you can follow up with a normal GET request.
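A minimal sketch of that HEAD-then-GET flow follows. The helper name head_then_get is hypothetical, and session is assumed to be a FuturesSession, whose head() and get() return futures. Treating any status of 400 or above as "broken" is also an assumption; some servers reject HEAD even where GET would succeed.

```python
def head_then_get(session, url):
    """Return the body bytes, or None if the HEAD check fails.

    `session` is assumed to be a requests_futures FuturesSession;
    head_then_get is a hypothetical helper name.
    """
    # requests does not follow redirects for HEAD by default,
    # so ask for that explicitly.
    head = session.head(url, allow_redirects=True).result()
    if head.status_code >= 400:
        return None  # broken link: skip the GET entirely
    # The headers look fine, so issue the real download.
    return session.get(url).result().content
```

This costs two round trips per working link, which is the trade-off against the single-request approach.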
From the comments, it looks like you might also be interested in doing this in a single request. It is possible to do so directly with sockets: send the normal GET request and recv the first block. If you don't like the result, close the connection; otherwise, loop over the remaining blocks.
Here is a proof of concept of how to download conditionally with a single request:
import socket

def fetch_on_header_condition(host, resource, condition, port=80):
    # Build a minimal HTTP/1.1 request; 'Connection: close' makes the
    # server close the socket once the response is complete.
    request = 'GET %s HTTP/1.1\r\n' % resource
    request += 'Host: %s\r\n' % host
    request += 'Connection: close\r\n'
    request += '\r\n'
    s = socket.socket()
    try:
        s.connect((host, port))
        # Sockets carry bytes in Python 3; sendall() retries partial sends.
        s.sendall(request.encode('ascii'))
        # The first block normally holds the status line and the headers.
        first_block = s.recv(4096)
        if not condition(first_block):
            return False, b''
        blocks = [first_block]
        while True:
            block = s.recv(4096)
            if not block:
                break
            blocks.append(block)
        return True, b''.join(blocks)
    finally:
        s.close()

if __name__ == '__main__':
    print(fetch_on_header_condition(
        host='www.jython.org',
        port=80,
        resource='/',
        condition=lambda s: b'Content-Type: text/xml' in s,
    ))
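The substring test in the condition above is fragile, since header names are case-insensitive and the status code is not checked. One way to make the decision more robust is to parse the first block before deciding. The helper below is a hypothetical sketch, not part of any library, and it assumes the whole header section arrived in the first recv.

```python
def parse_response_head(first_block):
    """Split the start of a raw HTTP response into (status, headers).

    Returns None when first_block (bytes) does not yet contain the
    blank line that ends the header section.
    """
    head, separator, _body = first_block.partition(b"\r\n\r\n")
    if not separator:
        return None  # headers incomplete; the caller would need to recv more
    lines = head.decode("iso-8859-1").split("\r\n")
    # The status line looks like "HTTP/1.1 200 OK".
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return status, headers
```

A condition could then unpack the result and check, for example, that the status is 200 and that headers.get('content-type') has the expected value, treating a None result as a failed check.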