
Select all div siblings by using BeautifulSoup

I have an HTML file with a structure like the following:

<div>
</div>

<div>
</div>

<div>
  <div>
  </div>
  <div>
  </div>
  <div>
  </div>
</div>

<div>
  <div>
  </div>
</div>

I would like to select only the top-level sibling div elements, without selecting the nested div elements in the third and fourth blocks. If I use find_all(), I get all of the divs, including the nested ones.

Mazzy asked Nov 16 '25 09:11



1 Answer

You can select only the direct children of the parent element with the CSS child combinator (>):

soup.select('body > div')

This returns the div elements directly under the top-level body tag and skips any div nested inside them.
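
As a minimal sketch of the selector approach, the snippet below parses a sample document shaped like the one in the question. The id attributes and the explicit html/body wrapper are assumptions added for illustration; they are not in the original markup. The wrapper matters because html.parser does not add a body tag on its own.

```python
from bs4 import BeautifulSoup

# Hypothetical sample mirroring the question's structure; the ids are made up
html = """
<html><body>
<div id="a"></div>
<div id="b"></div>
<div id="c">
  <div id="c1"></div>
  <div id="c2"></div>
  <div id="c3"></div>
</div>
<div id="d">
  <div id="d1"></div>
</div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() descends into every level, so it also returns the nested divs
every_div = soup.find_all("div")

# The child combinator matches only divs whose parent is the body tag
top_level_divs = soup.select("body > div")
top_level_ids = [div["id"] for div in top_level_divs]
print(len(every_div), top_level_ids)
```

If the real file has a different container than body, the left-hand side of the selector would need to name that container instead.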

You could also find the first div, then grab all matching siblings with Element.find_next_siblings():

first_div = soup.find('div')
all_divs = [first_div] + first_div.find_next_siblings('div')
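
Here is a small runnable sketch of the sibling approach. The sample markup and its id attributes are assumptions for illustration only. Note that find() returns the first div in document order, so this only works as intended when that first div is one of the top-level ones, as it is here.

```python
from bs4 import BeautifulSoup

# Hypothetical sample mirroring the question's structure; the ids are made up
html = """
<html><body>
<div id="a"></div>
<div id="b"></div>
<div id="c">
  <div id="c1"></div>
  <div id="c2"></div>
</div>
<div id="d">
  <div id="d1"></div>
</div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# First div in document order; in this sample it is a top-level div
first_div = soup.find("div")

# find_next_siblings() stays on the same level, so nested divs are never visited
all_divs = [first_div] + first_div.find_next_siblings("div")
sibling_ids = [div["id"] for div in all_divs]
print(sibling_ids)
```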

Or you could use the element.children generator and filter those:

all_divs = (elem for elem in top_level.children if getattr(elem, 'name', None) == 'div')

where top_level is the element containing these div elements directly.
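
A sketch of the children-based filter follows, assuming top_level is the body tag of a sample document (the markup and ids are made up for illustration). The getattr() guard is there because .children also yields the whitespace text nodes between the tags. The last lines show find_all() with recursive=False, which is not part of the answer above but should give the same result.

```python
from bs4 import BeautifulSoup

# Hypothetical sample mirroring the question's structure; the ids are made up
html = """
<html><body>
<div id="a"></div>
<div id="b"></div>
<div id="c">
  <div id="c1"></div>
  <div id="c2"></div>
</div>
<div id="d">
  <div id="d1"></div>
</div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Assumption: the body tag is the element that directly contains the divs
top_level = soup.body

# .children yields direct children only, including text nodes, so filter by tag name
all_divs = [
    elem for elem in top_level.children
    if getattr(elem, "name", None) == "div"
]
child_ids = [div["id"] for div in all_divs]

# Alternative: limit find_all() to direct children instead of all descendants
direct_ids = [div["id"] for div in top_level.find_all("div", recursive=False)]
print(child_ids, direct_ids)
```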

Martijn Pieters answered Nov 18 '25 00:11
