Using Beautiful Soup to get the full URL in source code

Q: How do you get a URL with BeautifulSoup?

Find the a tags in the BeautifulSoup object to extract the links. Get the actual URL from each anchor tag object by calling its get() method with 'href' as the argument. Likewise, you can get the title of a link by calling get() with 'title' as the argument.
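A minimal sketch of that approach, assuming the bs4 package is installed. The HTML snippet and the variable names are made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for illustration.
html = """
<a href="/folder/big/a.jpg" title="Big picture">big</a>
<a href="http://example.com/about">about</a>
<a name="no-href">anchor without href</a>
"""

soup = BeautifulSoup(html, "html.parser")

# href=True keeps only anchors that actually carry an href attribute.
# get() returns None for a missing attribute (here: a missing title).
links = [(a.get("href"), a.get("title")) for a in soup.find_all("a", href=True)]
print(links)
```

Note that get('href') returns the attribute exactly as written in the HTML, so relative links stay relative.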

Q: Which method in BeautifulSoup is used to check all URLs or images?

Method 1: Using descendants and find(). First, import the required modules, then request the URL and pass the response to a BeautifulSoup object for parsing. Then use find() to locate the <body> tag and the <ul> tags inside it.
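One way this could look, as a sketch only: the HTML is an inline stand-in for a downloaded page (so nothing is fetched over the network), and the assumption that the links and images sit inside a <ul> comes from the description above.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical page content; in practice this would be the response body.
html = (
    "<html><body><ul>"
    "<li><a href='/a'>A</a></li>"
    "<li><img src='/b.png'></li>"
    "</ul></body></html>"
)

soup = BeautifulSoup(html, "html.parser")
ul = soup.find("body").find("ul")

found = []
# .descendants walks every nested node, including text nodes,
# so keep only Tag objects that are links or images.
for node in ul.descendants:
    if isinstance(node, Tag) and node.name in ("a", "img"):
        found.append((node.name, node.get("href") or node.get("src")))

print(found)
```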

Tags:

python

So I was looking at some source code and I came across this bit of code:

<img src="/gallery/2012-winners-finalists/HM_Watching%20birds2_Shane%20Conklin_MA_2012.jpg"

In the source view the link is blue, and when you click it, it takes you to the full URL where that picture is located. I know how to get what is shown in the source code in Python using Beautiful Soup, but how do I get the full URL you reach by clicking the link in the source code?

EDIT: If I was given <a href = "/folder/big/a.jpg", how do I figure out the starting part of that URL through Python or Beautiful Soup?


asked Jul 31 '13 13:07

user2476540

1 Answer

<a href="/folder/big/a.jpg">

That’s an absolute path on the current host. So if the HTML file is at http://example.com/foo/bar.html, then applying the URL /folder/big/a.jpg will result in this:

http://example.com/folder/big/a.jpg

I.e. keep the scheme and host name and replace the path with the new one.

Python's standard library has the urljoin function to perform this operation for you:

>>> from urllib.parse import urljoin
>>> base = 'http://example.com/foo/bar.html'
>>> href = '/folder/big/a.jpg'
>>> urljoin(base, href)
'http://example.com/folder/big/a.jpg'

For Python 2, the function is within the urlparse module.
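Combined with Beautiful Soup, this might look like the following sketch for Python 3. It assumes the bs4 package is installed and that you know the address the page was loaded from; page_url and the HTML snippet are made-up values for illustration:

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

# Assumed address of the page the HTML came from (hypothetical).
page_url = "http://example.com/foo/bar.html"

# Hypothetical HTML used only for illustration.
html = """
<img src="/gallery/a%20b.jpg">
<img src="thumbs/c.jpg">
<a href="/folder/big/a.jpg">big</a>
"""

soup = BeautifulSoup(html, "html.parser")

# Resolve every src/href against the page address.
image_urls = [urljoin(page_url, img["src"]) for img in soup.find_all("img", src=True)]
link_urls = [urljoin(page_url, a["href"]) for a in soup.find_all("a", href=True)]

print(image_urls)
print(link_urls)
```

A path starting with / replaces the whole path of the base, while a path without a leading slash (thumbs/c.jpg) is resolved against the directory of the page, giving http://example.com/foo/thumbs/c.jpg here.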

answered Oct 20 '22 15:10

poke
