Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to check the url is either web page link or file link in python

Suppose i have links as follows:

    http://example.com/index.html
    http://example.com/stack.zip
    http://example.com/setup.exe
    http://example.com/news/

In the above links first and fourth links are web page links and second and third are the file link.

These are only some examples of files links i.e .zip and .exe, but there may be many other files.

Is there any standard way to distinguish between file url or web page link? Thanks in advance.

like image 448
Bishwash Avatar asked Feb 14 '23 22:02

Bishwash


1 Answers

import urllib
import mimetypes


def guess_type_of(link, strict=True):
    link_type, _ = mimetypes.guess_type(link)
    if link_type is None and strict:
        u = urllib.urlopen(link)
        link_type = u.headers.gettype() # or using: u.info().gettype()
    return link_type

Demo:

links = ['http://stackoverflow.com/q/21515098/538284', # It's a html page
         'http://upload.wikimedia.org/wikipedia/meta/6/6d/Wikipedia_wordmark_1x.png', # It's a png file
         'http://commons.wikimedia.org/wiki/File:Typing_example.ogv', # It's a html page
         'http://upload.wikimedia.org/wikipedia/commons/e/e6/Typing_example.ogv'   # It's an ogv file
]

for link in links:
    print(guess_type_of(link))

Output:

text/html
image/x-png
text/html
application/ogg
like image 115
Omid Raha Avatar answered Apr 29 '23 18:04

Omid Raha