Looking at the requests documentation, I know that I can use response.content for binary content (such as a .jpg file) and response.text for a regular HTML page. However, when the source is an image and I try to access r.text, the script hangs. How can I determine in advance whether the response contains HTML?
I have considered checking the URL for an image extension, but that does not seem foolproof.
The content type is sent as a response header, which requests exposes as r.headers['content-type']. See this page in the documentation.
Example code:
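import requests

r = requests.get(url)
# The header may carry parameters (e.g. 'text/html; charset=utf-8'), so compare only the media type.
content_type = r.headers.get('content-type', '').split(';')[0].strip().lower()
if content_type == 'text/html':
    data = r.text
elif content_type == 'application/ogg':
    data = r.content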
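To check the type before the body is downloaded at all, one option is to pass stream=True, which makes requests return as soon as the headers arrive and defer the body until .text or .content is accessed. The following is only a sketch: the helper names media_type and fetch, the 10-second timeout, and the choice to treat everything that is not text/html as binary are my own assumptions, not something from the question.

```python
def media_type(content_type_header):
    """Return the bare media type, e.g. 'text/html' from 'text/html; charset=utf-8'."""
    if not content_type_header:
        return ''
    return content_type_header.split(';', 1)[0].strip().lower()


def fetch(url):
    """Return text for HTML responses and raw bytes for everything else."""
    import requests  # third-party; imported here so media_type() works without it

    # stream=True: only the headers are read up front; the body is
    # downloaded when .text or .content is first accessed.
    with requests.get(url, stream=True, timeout=10) as r:
        r.raise_for_status()
        if media_type(r.headers.get('content-type')) == 'text/html':
            return r.text
        return r.content
```

Keep in mind that the header is whatever the server chooses to send, so a misconfigured server can still label an image as text/html or omit the header entirely; in the latter case this sketch falls back to returning bytes.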