BeautifulSoup 4, findNext() function

Tags: python, beautifulsoup, python-2.7

I'm playing with BeautifulSoup 4 and I have this HTML code:

</tr>
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>

I want to match both values in the <td> tags that follow "Giraffe", here 14 and 7.

I tried this:

giraffe = soup.find(text='Giraffe').findNext('td').text

but this only matches 14. How can I match both values with this function?

asked Apr 02 '13 18:04

nutship

1 Answer

Use find_all instead of findNext:

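import bs4 as bs

content = '''\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>'''
# Name the parser explicitly so bs4 does not have to guess one.
soup = bs.BeautifulSoup(content, 'html.parser')

# Find the <td> containing 'Giraffe', step up to its <tr>, then list every <td> in that row.
# (Newer bs4 releases also accept string= in place of text=.)
for td in soup.find('td', text='Giraffe').parent.find_all('td'):
    print(td.text)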

yields

Giraffe
14
7
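
If only the numeric cells are wanted, the label cell can be sliced off the front of the row. The following is a minimal sketch rather than part of the original answer: it assumes the `html.parser` backend and looks the cell up by the `id` attribute shown in the question instead of by its text; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html = """\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>"""

soup = BeautifulSoup(html, "html.parser")

# Find the label cell by its id, then step up to the enclosing <tr>.
row = soup.find("td", id="freistoesse").parent

# Collect the text of every cell in the row, then split label from values.
cells = [td.get_text(strip=True) for td in row.find_all("td")]
label, numbers = cells[0], cells[1:]

print(label)    # Giraffe
print(numbers)  # ['14', '7']
```

Slicing relies on the label being the first cell of the row, which holds for the snippet in the question but may not hold for other tables.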

Or, you could use find_next_siblings (also known as fetchNextSiblings):

for td in soup.find(text='Giraffe').parent.find_next_siblings():
    print(td.text)

yields

14
7
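
As a sketch of how the sibling approach might be used in practice, the example below passes the tag name to `find_next_siblings` and converts the results to integers. It assumes bs4 4.4.0 or later (for the `string=` argument, the newer spelling of `text=`) and that every following cell holds a whole number; neither assumption comes from the original answer.

```python
from bs4 import BeautifulSoup

html = """\
<tr>
<td id="freistoesse">Giraffe</td>
<td>14</td>
<td>7</td>
</tr>"""

soup = BeautifulSoup(html, "html.parser")

# Locate the label cell by its text content.
label_cell = soup.find("td", string="Giraffe")

# Walk only the <td> siblings that come after the label cell in the same row.
values = [int(td.get_text(strip=True)) for td in label_cell.find_next_siblings("td")]

print(values)  # [14, 7]
```

Because siblings share a parent, this stays inside the current row even when the table has more rows after it.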

Explanation:

Note that soup.find(text='Giraffe') returns a NavigableString.

In [30]: soup.find(text='Giraffe')
Out[30]: u'Giraffe'

To get the associated td tag, use

In [31]: soup.find('td', text='Giraffe')
Out[31]: <td id="freistoesse">Giraffe</td>

or

In [32]: soup.find(text='Giraffe').parent
Out[32]: <td id="freistoesse">Giraffe</td>

Once you have the td tag, you could use find_next_siblings:

In [35]: soup.find(text='Giraffe').parent.find_next_siblings()
Out[35]: [<td>14</td>, <td>7</td>]
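
The distinction above can be checked directly, and it also shows why the attempt in the question returned a single value: `find_next` (the underscore spelling of `findNext`) returns only the first match, whereas `find_all_next` returns every match. This is a small sketch assuming the `html.parser` backend and bs4 4.4.0 or later for `string=`.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

html = '<tr><td id="freistoesse">Giraffe</td><td>14</td><td>7</td></tr>'
soup = BeautifulSoup(html, "html.parser")

text_node = soup.find(string="Giraffe")  # the text inside the cell
cell = text_node.parent                  # the <td> tag that encloses it

print(isinstance(text_node, NavigableString))  # True
print(isinstance(cell, Tag))                   # True

# find_next stops at the first <td> after the text node ...
first = text_node.find_next("td")
# ... while find_all_next keeps going through the rest of the document.
all_after = text_node.find_all_next("td")

print(first.get_text())                      # 14
print([td.get_text() for td in all_after])   # ['14', '7']
```

Note that `find_all_next` is not limited to the current row; in a larger table it would also return cells from later rows, which is one reason to prefer `find_next_siblings` here.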

PS. BeautifulSoup 4 added method names that use underscores instead of camelCase. They do the same thing but conform to the PEP 8 style guide's recommendations, so prefer find_next_siblings over fetchNextSiblings.
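
A quick way to see that the two spellings are equivalent is to call both on the same tag. The sketch below uses `findNextSiblings` as the camelCase example and assumes the installed bs4 release still ships that alias; some releases may emit a DeprecationWarning for it, so the warning is silenced here.

```python
import warnings

from bs4 import BeautifulSoup

html = '<tr><td id="freistoesse">Giraffe</td><td>14</td><td>7</td></tr>'
soup = BeautifulSoup(html, "html.parser")
cell = soup.find("td", id="freistoesse")

with warnings.catch_warnings():
    # The camelCase alias may trigger a DeprecationWarning in some releases.
    warnings.simplefilter("ignore")
    legacy = cell.findNextSiblings("td")

modern = cell.find_next_siblings("td")

# Both calls return the same two <td> tags.
print(legacy == modern)  # True
```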

answered Sep 26 '22 14:09

unutbu
