Suppose I have the following links:
http://example.com/index.html
http://example.com/stack.zip
http://example.com/setup.exe
http://example.com/news/
In the links above, the first and fourth are web page links, and the second and third are file links.
.zip and .exe are only two examples of file links; there may be many other file types.
Is there a standard way to distinguish a file URL from a web page link? Thanks in advance.
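import mimetypes
import urllib.request  # Python 3; Python 2 used urllib.urlopen directly

def guess_type_of(link, strict=True):
    # First try to guess the type from the URL's extension alone (no network access).
    link_type, _ = mimetypes.guess_type(link)
    if link_type is None and strict:
        # No recognizable extension: ask the server and read its Content-Type header.
        with urllib.request.urlopen(link) as u:
            link_type = u.headers.get_content_type()  # Python 2: u.headers.gettype()
    return link_type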
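As a quick illustration of the first step, mimetypes.guess_type looks only at the URL's extension and makes no network request. The sketch below applies it to the example.com links from the question. The helper name extension_type is made up for this illustration, and the exact strings returned for some extensions can depend on the system's MIME table.

```python
import mimetypes

def extension_type(link):
    """Return the MIME type guessed from the URL's extension, or None."""
    link_type, _encoding = mimetypes.guess_type(link)
    return link_type

# The four links from the question; nothing is downloaded here.
for link in ['http://example.com/index.html',
             'http://example.com/stack.zip',
             'http://example.com/setup.exe',
             'http://example.com/news/']:
    print(link, '->', extension_type(link))
```

A result of None (as for the /news/ link, which has no extension) is the case where the fallback to the server's Content-Type header is needed.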
Demo:
links = ['http://stackoverflow.com/q/21515098/538284',  # an HTML page
         'http://upload.wikimedia.org/wikipedia/meta/6/6d/Wikipedia_wordmark_1x.png',  # a PNG file
         'http://commons.wikimedia.org/wiki/File:Typing_example.ogv',  # an HTML page
         'http://upload.wikimedia.org/wikipedia/commons/e/e6/Typing_example.ogv'  # an OGV file
         ]
for link in links:
    print(guess_type_of(link))
Output (exact strings may vary with the Python version and the system's MIME table):
text/html
image/x-png
text/html
application/ogg
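To turn the reported type into the page-or-file answer the question asks for, one option is to treat HTML types as web pages and everything else as files. The sketch below is only that: the names is_web_page, head_content_type and PAGE_TYPES are made up here, it assumes Python 3, and it assumes the server answers HEAD requests with an accurate Content-Type header, which not every server does.

```python
import urllib.request

# Assumption: only these MIME types count as "web pages".
PAGE_TYPES = {'text/html', 'application/xhtml+xml'}

def is_web_page(content_type):
    """True if a Content-Type value (parameters allowed) names an HTML page."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime = content_type.split(';', 1)[0].strip().lower()
    return mime in PAGE_TYPES

def head_content_type(link, timeout=10):
    """Request only the headers (HEAD), so the body is not downloaded."""
    request = urllib.request.Request(link, method='HEAD')
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get_content_type()

# Offline check of the classification step only.
print(is_web_page('text/html; charset=utf-8'))
print(is_web_page('application/zip'))
```

With network access, is_web_page(head_content_type(link)) would classify a single link; the two print calls above exercise only the offline part.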