
How do I truncate a URL using Python? [duplicate]

Tags: python

How do I truncate the URLs below right after the domain (".com") using Python, i.e. keep only youtube.com?

    youtube.com/video/AiL6nL
    yahoo.com/video/Hhj9B2
    youtube.com/video/MpVHQ
    google.com/video/PGuTN
    youtube.com/video/VU34MI

Is it possible to truncate like this?

Brisi asked Sep 15 '25 12:09


1 Answer

Check out Python's urlparse module (named urllib.parse in Python 3). It is part of the standard library, so nothing else needs to be installed.

So you could do the following:

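import re
# Python 3 module; on Python 2 use "from urlparse import urlsplit" instead.
from urllib.parse import urlsplit

# Sample input taken from the question.
url_list = [
    'youtube.com/video/AiL6nL',
    'yahoo.com/video/Hhj9B2',
    'google.com/video/PGuTN',
]

def check_and_add_http(url):
    # Checks if 'http://' or 'https://' is present at the start of the URL and adds 'http://' if not.
    http_regex = re.compile(r'^https?://')
    if http_regex.match(url):
        # 'http://' or 'https://' is present
        return url
    else:
        # Add 'http://' so that urlsplit recognises the domain as the netloc.
        return 'http://' + url

for url in url_list:
    url = check_and_add_http(url)
    # Index 1 of the result is the network location, e.g. 'youtube.com'.
    print(urlsplit(url)[1])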

You can read more about urlsplit() in the documentation, including the indexes (and the matching attribute names) to use if you want to read the other parts of the URL.
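As a rough illustration of those other parts, here is a minimal sketch for Python 3 that reads them by attribute name instead of by index. The helper name domain_of is made up for this example and is not part of the standard library, and the prefix check only handles the http and https schemes.

```python
from urllib.parse import urlsplit

def domain_of(url):
    """Return the network location (domain) of a URL, with or without a scheme.

    Hypothetical helper for illustration only.
    """
    # urlsplit only fills in netloc when the URL has a scheme (or starts with '//'),
    # so prepend 'http://' to bare URLs such as 'yahoo.com/video/Hhj9B2'.
    if not url.startswith(('http://', 'https://')):
        url = 'http://' + url
    return urlsplit(url).netloc

# The result is a named tuple, so each part is available by index or by name.
parts = urlsplit('http://youtube.com/video/AiL6nL')
print(parts.scheme)  # http
print(parts.netloc)  # youtube.com
print(parts.path)    # /video/AiL6nL
print(parts[1] == parts.netloc)  # True

print(domain_of('yahoo.com/video/Hhj9B2'))  # yahoo.com
```

Reading parts.netloc rather than parts[1] gives the same value and tends to be easier to follow when the code is revisited later.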

Ewan answered Sep 18 '25 10:09


