Python - Split url into its components

I have a huge list of urls that are all like this:

http://www.example.com/site/section1/VAR1/VAR2

Here VAR1 and VAR2 are the dynamic parts of the URL. I want to extract only VAR1 from this URL string. I've tried urlparse, but the output looks like this:

ParseResult(scheme='http', netloc='www.example.com', path='/site/section1/VAR1/VAR2', params='', query='', fragment='')
Hyperion asked Jul 01 '15 19:07

1 Answer

As a general rule, urlparse gives you each component of a URL separately. Here you can get the path with urlparse(url).path and then pick out the segment you want by calling split('/') on it. In Python 3, urlparse lives in urllib.parse; in Python 2 it is imported from the urlparse module.

>>> from urllib.parse import urlparse  # Python 2: from urlparse import urlparse
>>> url = 'http://www.example.com/site/section1/VAR1/VAR2'
>>> urlparse(url)
ParseResult(scheme='http', netloc='www.example.com', path='/site/section1/VAR1/VAR2', params='', query='', fragment='')
>>> urlparse(url).path
'/site/section1/VAR1/VAR2'
>>> urlparse(url).path.split('/')[-2]
'VAR1'
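
Since the question mentions a huge list of URLs, note that indexing from the end with [-2] returns the wrong segment if a URL has a trailing slash. The following is only a sketch of one way to guard against that: it assumes VAR1 is always the third non-empty path segment, and the helper name path_segment is made up for illustration.

```python
from urllib.parse import urlparse


def path_segment(url, index):
    """Return the path segment at position `index` (0-based), or None if absent."""
    # Drop the empty strings produced by leading, trailing, or doubled slashes.
    segments = [s for s in urlparse(url).path.split('/') if s]
    try:
        return segments[index]
    except IndexError:
        return None


urls = [
    'http://www.example.com/site/section1/VAR1/VAR2',
    'http://www.example.com/site/section1/VAR1/VAR2/',  # trailing slash
]

# Assumption: VAR1 is the third non-empty segment (index 2) in every URL.
print([path_segment(u, 2) for u in urls])
```

Counting from the start of the path instead of the end keeps the result stable when a URL carries a trailing slash; a query string or fragment does not matter either way, because urlparse already keeps those out of .path.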
Naman Sogani answered Sep 19 '22 17:09
