What I am trying to do is convert the Content-Type header of a requests response into a file extension. For HTML pages the typical Content-Type is "text/html; charset=utf-8"; that is the value I get back in Python. I have looked into the mimetypes module with no success, as it doesn't seem to accommodate what I am looking for.
Rundown:
I want to convert "text/html; charset=utf-8" into this ".html"
For images the content-type is typically something like "image/jpeg", depending on the image type, but I am not too worried about images, since most image URLs already include the extension in the path. This is more for pages whose URLs don't end in something like "blahahah.html".
I do not want to use any libraries that are not in the Python standard library.
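You could drop the parameters with partition(';') and strip(), then pass the bare MIME type to mimetypes.guess_extension:
import requests
from mimetypes import guess_extension

r = requests.get("http://stackoverflow.com/questions/29674905/convert-content-type-header-into-file-extension")
# Keep only the part before ';' (e.g. "text/html") and look up its extension.
print(guess_extension(r.headers['content-type'].partition(';')[0].strip()))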
.htm
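The same idea can be wrapped in a small helper that works on any header string, so it needs nothing beyond the standard library. This is only a sketch: the function name extension_from_content_type and its default parameter are made up for illustration, and the exact extension returned (for example ".htm" versus ".html") can differ between Python versions and platforms.

```python
from mimetypes import guess_extension


def extension_from_content_type(header_value, default=None):
    """Return a file extension (e.g. '.html') for a Content-Type header value.

    Hypothetical helper for illustration; returns `default` when the header
    is missing or the MIME type is unknown to the mimetypes module.
    """
    if not header_value:
        return default
    # Drop parameters such as "; charset=utf-8" and normalise case/whitespace.
    mime_type = header_value.partition(";")[0].strip().lower()
    # guess_extension returns None for types it does not know.
    return guess_extension(mime_type) or default


print(extension_from_content_type("text/html; charset=utf-8"))
print(extension_from_content_type("image/jpeg"))
print(extension_from_content_type(None, default=".bin"))
```

With a requests response you would call it as extension_from_content_type(r.headers.get('content-type')), which also covers responses that carry no Content-Type header at all.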