 

BeautifulSoup4: select elements where attributes are not equal to x

I'd like to do something like this:

soup.find_all('td', attrs!={"class":"foo"})

I want to find all td elements that do not have the class foo.
Obviously the above doesn't work. What does?

kylex asked May 22 '14 04:05



2 Answers

BeautifulSoup really makes the "soup" beautiful and easy to work with.

You can pass a function as the attribute value:

soup.find_all('td', class_=lambda x: x != 'foo')

Demo:

>>> from bs4 import BeautifulSoup
>>> data = """
... <tr>
...     <td>1</td>
...     <td class="foo">2</td>
...     <td class="bar">3</td>
... </tr>
... """
>>> soup = BeautifulSoup(data, 'html.parser')
>>> for element in soup.find_all('td', class_=lambda x: x != 'foo'):
...     print(element.text)
... 
1
3
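
One caveat worth checking: class is a multi-valued attribute, and as far as I can tell Beautiful Soup tests the filter function against each class name separately, so an element such as <td class="foo bar"> can still pass the lambda above (its "bar" value is not equal to 'foo'). A minimal sketch of a stricter alternative, assuming html.parser, an extra fourth cell added to the sample markup, and a hypothetical helper name td_without_foo, filters on the tag itself instead:

```python
from bs4 import BeautifulSoup

# Sample markup from the demo, plus one multi-class cell (an added assumption)
data = """
<tr>
    <td>1</td>
    <td class="foo">2</td>
    <td class="bar">3</td>
    <td class="foo bar">4</td>
</tr>
"""

soup = BeautifulSoup(data, "html.parser")


def td_without_foo(tag):
    """Hypothetical helper: True for <td> tags whose class list lacks 'foo'."""
    # tag.get("class", []) is a list of class names, or [] when the
    # tag has no class attribute at all.
    return tag.name == "td" and "foo" not in tag.get("class", [])


# find_all accepts a function that receives each tag and returns True/False
texts = [td.text for td in soup.find_all(td_without_foo)]
print(texts)  # ['1', '3']
```

Passing a function that takes the whole tag means the check sees the full class list at once, at the cost of having to test the tag name by hand.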
alecxe answered Oct 16 '22 07:10


The .select() method accepts a CSS selector as a string:

soup.select('td:not(.foo)')

The above code will return all <td> tags that do not have the class foo.
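
A short runnable sketch of this, reusing the sample markup from the other answer's demo with one extra multi-class cell (that extra cell and the html.parser choice are assumptions for illustration). It also assumes a Beautiful Soup version whose select() is backed by Soup Sieve, since older releases supported only a limited subset of CSS:

```python
from bs4 import BeautifulSoup

# Sample markup; the "foo bar" cell is an added assumption to show
# how :not(.foo) treats elements with several classes.
data = """
<tr>
    <td>1</td>
    <td class="foo">2</td>
    <td class="bar">3</td>
    <td class="foo bar">4</td>
</tr>
"""

soup = BeautifulSoup(data, "html.parser")

# :not(.foo) excludes every <td> whose class list contains "foo",
# and keeps cells that have no class attribute at all.
cells = soup.select("td:not(.foo)")
texts = [td.text for td in cells]
print(texts)  # ['1', '3']
```

Because the selector works on the full class list, the cell with class="foo bar" is excluded as well.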

Phil Filippak answered Oct 16 '22 06:10