BeautifulSoup4: select elements where attributes are not equal to x

Tags: python, html, html-parsing, beautifulsoup, python-2.7

I'd like to do something like this:

soup.find_all('td', attrs!={"class":"foo"})

I want to find all td elements that do not have the class foo.
Obviously the above doesn't work; what does?

324

asked May 22 '14 04:05

kylex

2 Answers

BeautifulSoup really makes the "soup" beautiful and easy to work with.

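You can pass a function as the attribute value; find_all() keeps a tag when the function returns True for that value (the value is None when the attribute is missing):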

soup.find_all('td', class_=lambda x: x != 'foo')

Demo:

>>> from bs4 import BeautifulSoup
>>> data = """
... <tr>
...     <td>1</td>
...     <td class="foo">2</td>
...     <td class="bar">3</td>
... </tr>
... """
>>> soup = BeautifulSoup(data, "html.parser")
>>> for element in soup.find_all('td', class_=lambda x: x != 'foo'):
...     print(element.text)
... 
1
3
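
One caveat worth checking: as far as I can tell, bs4 tests a multi-valued attribute such as class one token at a time, so the lambda above would also accept a cell like <td class="foo baz">, because the token baz is not equal to 'foo'. Below is a minimal sketch of a stricter filter, assuming you want to skip any cell whose class list contains foo. The helper name is_td_without_foo and the extra fourth cell are made up for illustration and are not part of the original answer.

```python
from bs4 import BeautifulSoup

data = """
<tr>
    <td>1</td>
    <td class="foo">2</td>
    <td class="bar">3</td>
    <td class="foo baz">4</td>
</tr>
"""

soup = BeautifulSoup(data, "html.parser")


def is_td_without_foo(tag):
    # Hypothetical helper: find_all() calls it with each Tag and keeps
    # the tag when it returns True.
    # "class" is multi-valued, so bs4 exposes it as a list of tokens;
    # fall back to an empty list for cells with no class attribute.
    return tag.name == "td" and "foo" not in tag.get("class", [])


cells = [td.text for td in soup.find_all(is_td_without_foo)]
print(cells)
```

With this data the filter keeps the cells containing 1 and 3 and drops both cells that carry foo anywhere in their class list.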

answered Oct 16 '22 07:10

alecxe

There is a method .select() which allows you to pass CSS selectors as a string:

soup.select('td:not(.foo)')

The code above returns all <td> tags that do not have the class foo.
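
A short sketch of the selector on sample markup, assuming a bs4 version whose .select() supports :not() (recent releases delegate CSS selection to the soupsieve package). The fourth cell with two classes is added here for illustration only.

```python
from bs4 import BeautifulSoup

data = """
<tr>
    <td>1</td>
    <td class="foo">2</td>
    <td class="bar">3</td>
    <td class="foo baz">4</td>
</tr>
"""

soup = BeautifulSoup(data, "html.parser")

# .foo matches any element whose class list contains the token "foo",
# so :not(.foo) also excludes the cell with class="foo baz".
selected = [td.text for td in soup.select("td:not(.foo)")]
print(selected)
```

Because CSS class selectors work on individual class tokens, this keeps the cells containing 1 and 3, including the cell that has no class attribute at all.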

answered Oct 16 '22 06:10

Phil Filippak
