
python requests is slow

I am developing a download manager and using the requests module in Python to check whether a link is valid (and, hopefully, to detect broken links). My code for checking a link is below:

import requests

url = 'http://pyscripter.googlecode.com/files/PyScripter-v2.5.3-Setup.exe'
r = requests.get(url, allow_redirects=False)  # this line takes 40 seconds
if r.status_code == 200:
    print("link valid")
else:
    print("link invalid")

Now, the issue is that this check takes approximately 40 seconds, which is far too long. How can I speed it up, perhaps using urllib2 or something similar?

Note: if I replace url with the actual URL, which is 'http://pyscripter.googlecode.com/files/PyScripter-v2.5.3-Setup.exe', the check takes one second, so it appears to be an issue with requests.

scandalous asked Apr 03 '13 06:04



2 Answers

Not all hosts support HEAD requests. You can use this instead:

r = requests.get(url, stream=True)

This downloads only the response headers, not the response content; the body is fetched only when you access it. Moreover, if the idea is to get the file afterwards, you don't have to make another request.

See the Requests documentation (Advanced Usage, "Body Content Workflow") for more information.
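To make the idea concrete, here is a minimal sketch of a link check built on stream=True. The helper name link_is_valid, the http parameter, and the 10-second timeout are my own assumptions for illustration, not part of the question. The response is used as a context manager so the connection is released even though the body is never read.

```python
def link_is_valid(url, http=None, timeout=10):
    """Return True if the server answers 200 for url, without fetching the body.

    http is any object with a requests-style get() method (hypothetical
    parameter, handy for testing); it defaults to the requests module.
    """
    if http is None:
        import requests as http  # imported lazily so a stub can be passed in

    # stream=True defers the body download until .content or
    # .iter_content() is used, so only the headers are transferred here.
    with http.get(url, stream=True, allow_redirects=False,
                  timeout=timeout) as r:
        return r.status_code == 200
```

If you do want the file afterwards, you could keep the response open instead of closing it and write r.iter_content(chunk_size=8192) to disk, which reuses the same request.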

michaelmeyer answered Sep 30 '22 17:09


Don't use get, which actually retrieves the file; use head instead:

r = requests.head(url, allow_redirects=False)

On my machine, this takes the check from 6.9 seconds down to 0.4 seconds.
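Since some hosts reject HEAD, one possible way to combine it with a fallback is sketched below. The helper name check_link, the http parameter, the timeout value, and the choice of 405/501 as "HEAD not supported" signals are assumptions on my part; servers may signal this differently.

```python
def check_link(url, http=None, timeout=10):
    """Return the HTTP status code for url, preferring a HEAD request.

    http is any object with requests-style head() and get() methods
    (hypothetical parameter); it defaults to the requests module.
    """
    if http is None:
        import requests as http  # imported lazily so a stub can be passed in

    r = http.head(url, allow_redirects=False, timeout=timeout)

    # 405 Method Not Allowed or 501 Not Implemented suggest the host does
    # not accept HEAD, so retry with a GET that does not read the body.
    if r.status_code in (405, 501):
        r = http.get(url, stream=True, allow_redirects=False,
                     timeout=timeout)
        r.close()  # release the connection without downloading the content

    return r.status_code
```

A caller would then treat check_link(url) == 200 as "link valid", matching the check in the question.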

Jon Clements answered Sep 30 '22 17:09