Display text from img alt tag with beautifulsoup

So far my code is:

year = range(1958,2013)
randomYear = random.choice(year)
randomYear = str(randomYear)
page = range(1,5)
randomPage = random.choice(page)
randomPage = str(randomPage)
print(randomPage, randomYear)
url = 'http://www.billboard.com/artists/top-100/'+randomYear+'?page='+randomPage
url1 = urlopen(url)
htmlSource = url1.read()
url1.close()
soup = BeautifulSoup(htmlSource)
listm = soup.findAll('article', {'class': 'masonry-brick','style' : 'position;  absolute; top; 0px; left: 0px;'})
for listm in soup.findAll('div',{'class': 'thumbnail'}):
    for listm in soup.find('img alt')(''):
        print(listm)

What I want to do is get the text of each img's alt attribute. I think my code is roughly correct, but it displays nothing.

Brian Fuller asked Dec 18 '13 03:12



1 Answer

To get <img> elements that have an alt attribute, you could use soup('img', alt=True), which is shorthand for soup.find_all('img', alt=True):

print("\n".join([img['alt'] for img in soup.find_all('img', alt=True)]))

Do not use the same name for different purposes; it hurts the readability of the code:

soup = BeautifulSoup(htmlSource, 'html.parser')  # name the parser explicitly
articles = soup('article', 'masonry-brick',
                style='position;  absolute; top; 0px; left: 0px;')
for div in soup.find_all('div', 'thumbnail'):
    # alt=True keeps only <img> tags that have an alt attribute
    for img in div.find_all('img', alt=True):
        print(img['alt'])

Note: articles is unused.
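
As a minimal offline sketch of the loop above, the snippet below runs the same find_all('img', alt=True) filter against a small inline HTML string instead of the live page. The sample markup, the names SAMPLE_HTML and collect_alt_texts, and the choice of the 'html.parser' backend are assumptions for illustration only; the real Billboard page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page layout is not known here.
SAMPLE_HTML = """
<div class="thumbnail"><img src="a.jpg" alt="Artist One"></div>
<div class="thumbnail"><img src="b.jpg"><img src="c.jpg" alt="Artist Two"></div>
<div class="other"><img src="d.jpg" alt="Not a thumbnail"></div>
"""


def collect_alt_texts(html):
    """Return the alt text of every <img> found inside a div.thumbnail."""
    soup = BeautifulSoup(html, "html.parser")
    alts = []
    for div in soup.find_all("div", "thumbnail"):
        # alt=True keeps only <img> tags that carry an alt attribute
        for img in div.find_all("img", alt=True):
            alts.append(img["alt"])
    return alts


if __name__ == "__main__":
    print("\n".join(collect_alt_texts(SAMPLE_HTML)))
```

The image without an alt attribute and the image outside a thumbnail div are both skipped, so only the two thumbnail alt texts are collected.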

I only need one img tag. How can I do this?

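You could use the .find() method to get one <img> element per <div>. It returns None when a <div> has no matching <img>, so check for that before indexing:

for div in soup.find_all('div', 'thumbnail'):
    img = div.find('img', alt=True)
    if img is not None:  # skip divs without an <img alt=...>
        print(img['alt'])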
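
To see what .find() does when a <div> holds several images or none with an alt attribute, here is a small sketch on made-up markup. SAMPLE_HTML and first_alt_per_thumbnail are hypothetical names, and the markup is an assumption rather than the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second thumbnail has no <img> with an alt attribute.
SAMPLE_HTML = """
<div class="thumbnail"><img src="a.jpg" alt="First"><img src="b.jpg" alt="Second"></div>
<div class="thumbnail"><img src="c.jpg"></div>
"""


def first_alt_per_thumbnail(html):
    """Return the alt text of the first matching <img> in each div.thumbnail."""
    soup = BeautifulSoup(html, "html.parser")
    alts = []
    for div in soup.find_all("div", "thumbnail"):
        img = div.find("img", alt=True)  # first match only, or None
        if img is not None:
            alts.append(img["alt"])
    return alts


if __name__ == "__main__":
    print(first_alt_per_thumbnail(SAMPLE_HTML))
```

Only the first image of the first <div> contributes; the second <div> is skipped because .find() returns None for it.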

jfs answered Oct 30 '22 11:10