I am confused about how Beautiful Soup works when you want to grab a child of a tag. I have the following HTML code:
<div class="media item avatar profile">
<a href="http://..." class="media-link action-medialink">
<img class="media-item-img" src="http://...jpeg" alt="name" title="name" width="150" height="200">
</a>
</div>
I want to grab the src tag. I am using the following code:
soup = BeautifulSoup(file_)
for x in soup.find('div', attrs={'class':'media item avatar profile'}).findNext('img'):
    print x
This prints the whole img tag. How do I select only the src?
Thank you.
src is an attribute of the tag. Once you have the tag, access its attributes as you would dictionary keys. You only found the <a> tag, so you need to navigate to the contained <img> tag too:
for x in soup.find_all('div', attrs={'class':'media item avatar profile'}):
    print(x.a.img['src'])
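For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same loop. It assumes Python 3, the bs4 package, and the built-in html.parser backend; the names html, div, img, and srcs are illustrative and not from the original code.

```python
from bs4 import BeautifulSoup

# The markup from the question, inlined so the example runs on its own.
html = """
<div class="media item avatar profile">
<a href="http://..." class="media-link action-medialink">
<img class="media-item-img" src="http://...jpeg" alt="name" title="name" width="150" height="200">
</a>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

srcs = []
for div in soup.find_all("div", attrs={"class": "media item avatar profile"}):
    img = div.a.img          # first <a> inside the div, then its first <img>
    srcs.append(img["src"])  # attributes are read like dictionary keys

print(srcs)
```

If an attribute might be missing, img.get("src") returns None instead of raising KeyError, which is often the safer choice when scraping pages you do not control.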
Your code used findNext(), which returns a tag object; looping over that gives you the children, so x was the img object. I changed this to be a bit more direct and clearer: x is now the div, and we navigate directly to the first a and its contained img tag.
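The point about looping over a tag can be checked directly. The sketch below is only an illustration under assumptions: Python 3, bs4 with html.parser, a trimmed copy of the question's markup written on one line to avoid whitespace text nodes, and hypothetical variable names. It shows that iterating over the tag returned by find_next() yields that tag's children, and that no loop is needed when you only want one attribute.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Trimmed version of the question's markup, without newlines between tags.
html = (
    '<div class="media item avatar profile">'
    '<a href="http://..." class="media-link action-medialink">'
    '<img class="media-item-img" src="http://...jpeg" alt="name">'
    '</a></div>'
)
soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", attrs={"class": "media item avatar profile"})

# find_next() returns a single Tag; iterating over it walks its children.
link = div.find_next("a")
child_names = [child.name for child in link if isinstance(child, Tag)]

# Without a loop: take the tag, then index it by attribute name.
img = div.find_next("img")
src = img["src"]

print(child_names, src)
```

Both div.a.img and div.find_next("img") reach the same element here; the attribute-style navigation makes the expected nesting explicit, while find_next() simply takes the next matching tag in document order.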