
BeautifulSoup: how to select a certain tag

I am confused about how BeautifulSoup works when you want to grab a child of a tag. I have the following HTML code:

<div class="media item avatar profile">
<a href="http://..." class="media-link action-medialink">
<img class="media-item-img" src="http://...jpeg" alt="name" title="name" width="150" height="200">
</a>
</div>    

I want to grab the src tag. I am using the following code:

soup = BeautifulSoup(file_)
for x in soup.find('div', attrs={'class':'media item avatar profile'}).findNext('img'):
    print(x)

This prints the whole img tag. How do I select only the src?

Thank you.

evi asked Apr 10 '13 07:04


1 Answer

src is an attribute of the tag. Once you have the tag, access its attributes as you would dictionary keys. You only found the a tag, so you also need to navigate to the img tag it contains:

for x in soup.find_all('div', attrs={'class':'media item avatar profile'}):
    print(x.a.img['src'])

Your code used findNext(), which returns a single tag object; looping over that tag yields its children, so x was the img object. I changed this to be more direct and clearer: x is now the div, and we navigate directly to its first a tag and the img tag contained in it.
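For anyone who wants to try this out, here is a minimal self-contained sketch of both points: iterating over a tag yields its children, and attributes are read like dictionary keys. It assumes the bs4 package with the built-in html.parser. The example.com URLs are placeholders standing in for the elided ones in the question, and data-missing is a made-up attribute name used only to show what .get() does when an attribute is absent.

```python
from bs4 import BeautifulSoup, Tag

# HTML from the question; the URLs are hypothetical placeholders.
html = """
<div class="media item avatar profile">
<a href="http://example.com/profile" class="media-link action-medialink">
<img class="media-item-img" src="http://example.com/avatar.jpeg" alt="name" title="name" width="150" height="200">
</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", attrs={"class": "media item avatar profile"})

# Iterating over a Tag walks its direct children (tags and text nodes),
# not a list of search results. Keep only the child tags here.
a_children = [child.name for child in div.a if isinstance(child, Tag)]

# Attributes behave like dictionary keys on the tag.
img = div.a.img
src = img["src"]

# .get() returns None instead of raising KeyError for a missing attribute
# ("data-missing" is a made-up name for illustration).
missing = img.get("data-missing")

print(a_children)
print(src)
print(missing)
```

If some of the divs might not contain an a or img tag, x.a.img would fail with an AttributeError on None, so checking for None before indexing (or using .get()) is probably the safer choice in a real scraper.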

Martijn Pieters answered Oct 07 '22 03:10