
How do I scrape image-src in beautifulsoup

I am trying to get the value of the image-src attribute from this tag:

<img alt='Original Xiaomi Redmi Note 5 4GB RAM 64GB ROM Snapdragon S636 Octa Core Mobile Phone MIUI9 5.99" 2160*1080 4000mAh 12.0+5.0MP(China)' class="picCore" id="limage_32856997152" image-src="//ae01.alicdn.com/kf/HTB1WDJZbE_rK1Rjy0Fcq6zEvVXaS/Original-Xiaomi-Redmi-Note-5-4GB-RAM-64GB-ROM-Snapdragon-S636-Octa-Core-Mobile-Phone-MIUI9.jpg_220x220xz.jpg" itemprop="image"/>

I tried this code but it is not working:

images = soup.find('img').get('image-src')

Usually I use get('src') and it works, but here I need the image-src attribute, and get('image-src') does not return it.

Muhammad Faisal asked Mar 27 '19 19:03




1 Answer

Looking at the Beautiful Soup documentation, I found the find_all method, which worked for me in this case:

for link in soup.find_all('img'):
    print(link.get('image-src'))

Here is my full code:

from bs4 import BeautifulSoup

html_doc = """
<img alt='Original Xiaomi Redmi Note 5 4GB RAM 64GB ROM Snapdragon S636 Octa Core Mobile Phone MIUI9 5.99" 2160*1080 4000mAh 12.0+5.0MP(China)' class="picCore" id="limage_32856997152" image-src="//ae01.alicdn.com/kf/HTB1WDJZbE_rK1Rjy0Fcq6zEvVXaS/Original-Xiaomi-Redmi-Note-5-4GB-RAM-64GB-ROM-Snapdragon-S636-Octa-Core-Mobile-Phone-MIUI9.jpg_220x220xz.jpg" itemprop="image"/>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

for link in soup.find_all('img'):
    print(link.get('image-src'))

and the result:

//ae01.alicdn.com/kf/HTB1WDJZbE_rK1Rjy0Fcq6zEvVXaS/Original-Xiaomi-Redmi-Note-5-4GB-RAM-64GB-ROM-Snapdragon-S636-Octa-Core-Mobile-Phone-MIUI9.jpg_220x220xz.jpg  
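
A possible reason find('img') did not work on the real page: find returns only the first img tag, and if that first tag has no image-src attribute, get('image-src') returns None. Here is a minimal sketch of that situation, assuming a made-up page with a logo image in front of the product image (the logo tag, the shortened example.jpg path, and the https://www.aliexpress.com/ base URL are all hypothetical). It keeps only the tags that carry image-src and turns the protocol-relative // URL into a full one with urljoin:

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical page: the first <img> has no image-src attribute.
html_doc = """
<img src="/static/logo.png" alt="logo"/>
<img class="picCore" image-src="//ae01.alicdn.com/kf/example.jpg" itemprop="image"/>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# find() only looks at the first <img>, which lacks image-src, so this is None.
first = soup.find("img").get("image-src")
print(first)

# Passing True as the attribute value matches any <img> that has image-src set.
tags = soup.find_all("img", attrs={"image-src": True})

# The base URL is an assumption; urljoin takes the scheme from it
# to complete the protocol-relative "//..." address.
base_url = "https://www.aliexpress.com/"
urls = [urljoin(base_url, tag["image-src"]) for tag in tags]
print(urls)
```

If you prefer CSS selectors, soup.select('img[image-src]') should select the same tags.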
Matthew Salvatore Viglione answered Oct 02 '22 13:10
