lxml equivalent to BeautifulSoup "OR" syntax?

Question

I'm converting some HTML parsing code from BeautifulSoup to lxml. I'm trying to figure out the lxml equivalent syntax for the following BeautifulSoup statement:

soup.find('a', {'class': ['current zzt', 'zzt']})

Basically I want to find all of the "a" tags in the document that have a class attribute of either "current zzt" or "zzt". BeautifulSoup allows one to pass in a list, a dictionary, or even a regular expression to perform the match.

What is the lxml equivalent?

Thanks!

joeforker · Accepted Answer

No, lxml's CSSSelector does not provide the "find first or return None" method you're looking for; calling a selector returns a list of matches. Just use (select(soup) or [None])[0] if you need that, or write a function to do it for you.

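#!/usr/bin/env python3
import lxml.html
import lxml.cssselect  # recent lxml versions need the separate cssselect package installed

soup = lxml.html.fromstring("""
        <html>
        <a href="foo" class="yyy zzz" />
        <a href="bar" class="yyy" />
        <a href="baz" class="zzz" />
        <a href="quux" class="zzz yyy" />
        <a href="warble" class="qqq" />
        <p class="yyy zzz">Hello</p>
        </html>""")

# A comma in a CSS selector acts as "or": <a> with both classes, or <a> with class yyy.
select = lxml.cssselect.CSSSelector("a.yyy.zzz, a.yyy")
# Serialize every match; encoding="unicode" yields str rather than bytes.
print([lxml.html.tostring(s, encoding="unicode").strip() for s in select(soup)])
# First match, or None when the selector matches nothing.
print((select(soup) or [None])[0])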

OK, so soup.find('a') does return the first a element, or None, as you expect. The trouble is that find() only supports the limited ElementPath syntax, not the full XPath that CSSSelector relies on.
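If you want to match the whole class attribute string, as in the original BeautifulSoup call, one option is to skip CSS selectors and call xpath() directly, since XPath has an "or" operator. This is only a minimal sketch: the sample markup and the first_or_none helper are made up for illustration and are not part of lxml.

```python
import lxml.html

# Hypothetical sample markup that mirrors the class values from the question.
doc = lxml.html.fromstring("""
<html><body>
<a href="foo" class="current zzt">foo</a>
<a href="bar" class="zzt">bar</a>
<a href="baz" class="zzt current">baz</a>
<a href="quux" class="other">quux</a>
<p class="zzt">Hello</p>
</body></html>""")

# "or" compares the entire class attribute string, so "zzt current" is not matched.
matches = doc.xpath('//a[@class="current zzt" or @class="zzt"]')
hrefs = [a.get("href") for a in matches]


def first_or_none(elements):
    """Hypothetical helper: first item of a result list, or None if it is empty."""
    return elements[0] if elements else None


first = first_or_none(matches)
missing = first_or_none(doc.xpath('//a[@class="nope"]'))
print(hrefs, first, missing)
```

Note that this is an exact string comparison, so the order of the class names matters; the CSS selector approach above matches class tokens in any order instead.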
