I've researched this question but haven't found an actual solution. I'm using BeautifulSoup with Python, and what I want to do is get all image tags from a page, loop through them, and check whether each one's immediate parent is an anchor tag.
Here's some pseudocode:
html = BeautifulSoup(responseHtml)
for image in html.findAll('img'):
    if (image.parent.name == 'a'):
        image.hasParent = image.parent.link
Any ideas on this?
To get all ancestors of a tag in Beautiful Soup, use the Tag.parents property, which returns a generator that yields each enclosing tag in turn, starting with the immediate parent.
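As a rough illustration of walking up the tree, here is a minimal sketch. The sample HTML and the variable names (`ancestor_names`, `has_link_ancestor`) are made up for this example, and it assumes the built-in "html.parser" backend.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: an image nested inside a link inside a div.
html = '<body><div><a href="google.com"><img src="image.png"/></a></div></body>'
soup = BeautifulSoup(html, "html.parser")

img = soup.find("img")

# Tag.parents yields every ancestor, starting with the immediate parent
# and working outward to the top of the document.
ancestor_names = [parent.name for parent in img.parents]
print(ancestor_names)

# find_parent("a") returns the nearest enclosing <a> tag, or None if
# the image is not inside a link at any depth.
has_link_ancestor = img.find_parent("a") is not None
print(has_link_ancestor)
```

Note that this checks ancestors at any depth. If only the immediate parent matters, compare `img.parent.name` directly instead.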
BeautifulSoup has limited support for CSS selectors, but it covers the most commonly used ones. Use the select() method to find all matching elements and select_one() to find only the first match.
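The difference between the two methods might look like this in practice. This is only a sketch: the markup and variable names are invented for illustration, and it assumes the "html.parser" backend.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one image inside a link, one inside a paragraph.
html = """
<a href="google.com"><img src="linked.png"/></a>
<p><img src="plain.png"/></p>
"""
soup = BeautifulSoup(html, "html.parser")

# select() returns a list of every element matching the selector.
linked_images = soup.select("a > img")
linked_srcs = [img["src"] for img in linked_images]

# select_one() returns only the first match, or None when nothing matches.
first_linked = soup.select_one("a > img")
missing = soup.select_one("table > img")

print(linked_srcs)
print(first_linked["src"])
print(missing)
```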
You need to check the parent's name:
for img in soup.find_all('img'):
    # .parent is the tag that directly encloses the image
    if img.parent.name == 'a':
        print("Parent is a link")
Demo:
>>> from bs4 import BeautifulSoup
>>>
>>> data = """
... <body>
... <a href="google.com"><img src="image.png"/></a>
... </body>
... """
>>> soup = BeautifulSoup(data, "html.parser")
>>> img = soup.img
>>>
>>> img.parent.name
'a'
You can also retrieve the img tags that have a direct a parent using a CSS selector:
soup.select('a > img')
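To tie this back to the question's pseudocode, one possible way to record the link for each image is sketched below. The `image_links` dictionary and the sample HTML are assumptions made for illustration, and the link target is read from the parent's href attribute rather than from a `.link` property, which the question used only as pseudocode.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: one linked image and one unlinked image.
html = """
<body>
<a href="google.com"><img src="linked.png"/></a>
<div><img src="plain.png"/></div>
</body>
"""
soup = BeautifulSoup(html, "html.parser")

# Map each image's src to the href of its immediate parent link,
# or to None when the immediate parent is not an <a> tag.
image_links = {}
for img in soup.find_all("img"):
    parent = img.parent
    if parent is not None and parent.name == "a":
        # .get() returns None instead of raising if the <a> has no href.
        image_links[img["src"]] = parent.get("href")
    else:
        image_links[img["src"]] = None

print(image_links)
```

Storing the result in a separate dictionary avoids setting ad hoc attributes on the Tag objects themselves.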