Extract src attribute from img tag using BeautifulSoup

<div class="someClass">
    <a href="href">
        <img alt="some" src="some"/>
    </a>
</div>

I want to extract the src attribute from an img tag using BeautifulSoup (bs4). Calling a.attrs['src'] on the a tag does not give me the src, although I can get the href that way. What should I do?

asked May 15 '17 14:05

iDelusion

1 Answer

You can use BeautifulSoup to extract the src attribute of an HTML img tag. In my example, htmlText contains the img tags directly, but the same approach works for a URL when combined with urllib2 (Python 2).

For URLs

from BeautifulSoup import BeautifulSoup as BSHTML
import urllib2

page = urllib2.urlopen('http://www.youtube.com/')
soup = BSHTML(page)
images = soup.findAll('img')
for image in images:
    # print image source
    print image['src']
    # print alternate text
    print image['alt']

For text containing img tags

from BeautifulSoup import BeautifulSoup as BSHTML

htmlText = """<img src="https://src1.com/" />
<img src="https://src2.com/" />"""
soup = BSHTML(htmlText)
images = soup.findAll('img')
for image in images:
    print image['src']
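For the exact markup in the question, the reason a.attrs['src'] fails is that src belongs to the nested img tag, not to the a tag that wraps it. The following is a minimal sketch, assuming bs4 on Python 3 and the built-in html.parser; the variable names (link, img, img_src) are illustrative only.

```python
from bs4 import BeautifulSoup

# Markup copied from the question
html = """
<div class="someClass">
    <a href="href">
        <img alt="some" src="some"/>
    </a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

link = soup.find("a")
print(link["href"])      # href is an attribute of <a>
print(link.get("src"))   # None: <a> itself has no src attribute

img = link.find("img")   # step down to the nested <img>
print(img["src"])

# Equivalent lookup with a CSS selector
img_src = soup.select_one("div.someClass a img")["src"]
print(img_src)
```

Using .get() instead of square brackets returns None for a missing attribute rather than raising KeyError, which is convenient when probing a tag.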

Python 3 (updated on 2022-02-02)

from bs4 import BeautifulSoup as BSHTML
import urllib.request

page = urllib.request.urlopen('https://github.com/abushoeb/emotag')
soup = BSHTML(page, 'html.parser')
images = soup.find_all('img')

for image in images:
    # print image source (None if the tag has no src)
    print(image.get('src'))
    # print alternate text (None if the tag has no alt)
    print(image.get('alt'))
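On real pages some img tags have no src at all. One way to handle that, sketched here under the assumption that only images with a src are of interest, is to pass src=True to find_all so that tags lacking the attribute are skipped. The helper name extract_img_sources and the sample markup are hypothetical and not part of the original answer.

```python
from bs4 import BeautifulSoup


def extract_img_sources(html):
    """Return the src of every <img> in html that has one."""
    soup = BeautifulSoup(html, "html.parser")
    # src=True keeps only tags that actually carry a src attribute
    return [img["src"] for img in soup.find_all("img", src=True)]


# Hypothetical sample markup: the second image has no src
sample = """
<img src="https://src1.com/" alt="first"/>
<img alt="no source here"/>
<img src="https://src2.com/"/>
"""

sources = extract_img_sources(sample)
print(sources)
```

The same function can be given the result of urllib.request.urlopen(...).read() instead of a literal string.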

Install modules if needed

# Python 3
pip install beautifulsoup4
# urllib.request is part of the standard library and needs no install

answered Sep 30 '22 02:09

Abu Shoeb

Tags: python, regex, beautifulsoup
