How do I iterate over the HTML attributes of a Beautiful Soup element?
Like, given:
<foo bar="asdf" blah="123">xyz</foo>
I want "bar" and "blah".
To get href values with Python BeautifulSoup, we can use the find_all method. First create a soup object by calling the BeautifulSoup class with the HTML string. Then call find_all with 'a' and href set to True, which returns only the a elements that have an href attribute.
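A minimal sketch of that find_all call, assuming Beautiful Soup 4 (the bs4 package) and a made-up HTML string; the link targets are placeholders for illustration only:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: two links and one anchor without href
html = '<p><a href="/one">One</a> <a name="anchor">No link</a> <a href="/two">Two</a></p>'
soup = BeautifulSoup(html, "html.parser")

# href=True matches only <a> tags that carry an href attribute
hrefs = [a["href"] for a in soup.find_all("a", href=True)]
print(hrefs)  # ['/one', '/two']
```

The a element that has only a name attribute is skipped, so indexing with a["href"] cannot raise KeyError here.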
Using CSS selectors to locate elements in BeautifulSoup
Use the select() method to find multiple elements and select_one() to find a single element.
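As a rough illustration of the two selector methods, assuming Beautiful Soup 4 and a made-up document that reuses the foo element from the question:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup with two <foo> elements
html = '<div><foo bar="asdf" blah="123">xyz</foo><foo bar="qwer">abc</foo></div>'
soup = BeautifulSoup(html, "html.parser")

# select_one() returns the first match, or None if nothing matches
first = soup.select_one("foo[blah]")
# select() returns a list of every match
all_foo = soup.select("foo")

print(first["blah"])                    # 123
print([tag["bar"] for tag in all_foo])  # ['asdf', 'qwer']
```

The selector foo[blah] matches only foo elements that have a blah attribute, which is the CSS counterpart of passing blah=True to find_all.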
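To use Beautiful Soup, you need to install it: $ pip install beautifulsoup4 . Beautiful Soup also relies on a parser. Python's built-in html.parser works with no extra install; lxml is a faster third-party option, and Beautiful Soup prefers it when it is installed and no parser is named. You may already have lxml, but you should check (open IDLE and attempt to import lxml). If not, do: $ pip install lxml or $ apt-get install python3-lxml .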
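One way to do the lxml check from a script instead of IDLE, sketched under the assumption that Beautiful Soup 4 is installed; the fallback to html.parser is a choice made for this example, not something the library requires:

```python
import importlib.util

from bs4 import BeautifulSoup

# Use lxml when it is importable, otherwise fall back to the built-in parser
parser = "lxml" if importlib.util.find_spec("lxml") else "html.parser"

soup = BeautifulSoup('<foo bar="asdf" blah="123">xyz</foo>', parser)
print(parser, soup.find("foo")["bar"])
```

Naming the parser explicitly keeps the result the same across machines, since different parsers can build slightly different trees from the same markup.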
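from bs4 import BeautifulSoup  # Beautiful Soup 4 package
page = BeautifulSoup('<foo bar="asdf" blah="123">xyz</foo>', 'html.parser')
# In bs4, .attrs is a dict, so iterate over .items() for name/value pairs
for attr, value in page.find('foo').attrs.items():
    print(attr, "=", value)
# Prints:
# bar = asdf
# blah = 123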
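If only the attribute names are wanted ("bar" and "blah", as in the question), a short sketch assuming Beautiful Soup 4, where a tag's attrs is a plain dict:

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup('<foo bar="asdf" blah="123">xyz</foo>', "html.parser")
tag = soup.find("foo")

# Iterating a dict yields its keys, i.e. the attribute names
names = list(tag.attrs)
print(names)  # ['bar', 'blah']

# tag.get() returns None for an attribute that is absent
print(tag.get("missing"))  # None
```

The names come back in the order they appear in the markup.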