Getting tag after tag?

I'm trying to get the first tag after another tag with BeautifulSoup.

Let's suppose I have this:

<span class="number">5</span>
<span class="b">xxx</span><span class="number">10</span>

I could get the number in the second .number span with a regex, and it would be pretty solid. But we all know regex isn't supposed to parse HTML, so I'm doing this with BeautifulSoup. Currently I'm using:

soup('span', {'class': 'number'})[1].string

But if another span.number is inserted before the one I want, the code will break, since the one I need will become [2].

Is there any way to use BeautifulSoup to get the first span.number AFTER span.b?

Leonardo Arroyo asked Oct 04 '22 23:10


1 Answer

You could use next_sibling to get the next tag after <span class="b">:

import bs4 as bs


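content = '''<span class="number">5</span>
<span class="b">xxx</span><span class="number">10</span>'''

# Name the parser explicitly so the result does not depend on which parsers are installed
soup = bs.BeautifulSoup(content, 'html.parser')

# next_sibling is the node immediately after <span class="b">
print(soup('span', {'class': 'b'})[0].next_sibling)
# <span class="number">10</span>

print(soup('span', {'class': 'b'})[0].next_sibling.string)
# 10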

If you are using BeautifulSoup version 3, the equivalent attribute is called nextSibling.
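
One caveat: next_sibling returns whatever node comes immediately after, which may be a whitespace string or an unrelated tag if the markup changes. If you specifically want the first span.number after span.b, a filtered search is probably more robust. The following is only a sketch: the extra span.other element and the variable names are made up for illustration, and it assumes BeautifulSoup 4 with the built-in html.parser.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an extra span sits between span.b and the target.
html = '''<span class="number">5</span>
<span class="b">xxx</span>
<span class="other">skip me</span>
<span class="number">10</span>'''

soup = BeautifulSoup(html, "html.parser")
marker = soup.find("span", class_="b")

sibling_value = None
following_value = None
if marker is not None:
    # Search only later siblings (same parent) for the first span.number.
    sibling = marker.find_next_sibling("span", class_="number")
    # Search everything later in document order, at any nesting depth.
    following = marker.find_next("span", class_="number")
    sibling_value = sibling.string if sibling is not None else None
    following_value = following.string if following is not None else None

print(sibling_value)
print(following_value)
```

Use find_next_sibling when the target shares a parent with span.b, and find_next when it may be nested elsewhere later in the document. Both return None when nothing matches, so check before reading .string.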

unutbu answered Oct 10 '22 21:10