How to find children of nodes using BeautifulSoup

Try this

li = soup.find('li', {'class': 'test'})
# find_all is the current name for findChildren; recursive=False keeps only direct children
children = li.find_all("a", recursive=False)
for child in children:
    print(child)

There's a very short section in the docs that shows how to limit find/find_all to direct children.

https://www.crummy.com/software/BeautifulSoup/bs4/doc/#the-recursive-argument

In your case, since you want link1, which is the first direct child:

# for only first direct child
soup.find("li", { "class" : "test" }).find("a", recursive=False)

If you want all direct children:

# for all direct children
soup.find("li", { "class" : "test" }).find_all("a", recursive=False)
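
For anyone who wants to run this end to end, here is a minimal sketch. The HTML string is my own assumption about the markup (an li with class "test" holding two direct <a> children and one nested one), and "html.parser" is just the parser I picked; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Assumed sample markup: link1 and link3 are direct children of li.test,
# link2 sits one level deeper inside a nested list.
html = """
<div>
  <li class="test">
    <a>link1</a>
    <ul>
      <li>
        <a>link2</a>
      </li>
    </ul>
    <a>link3</a>
  </li>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
li = soup.find("li", {"class": "test"})

# recursive=False restricts the search to direct children of li
first_direct = li.find("a", recursive=False)
all_direct = li.find_all("a", recursive=False)

print(first_direct.get_text())              # link1
print([a.get_text() for a in all_direct])   # ['link1', 'link3']
```

link2 is skipped because it is a grandchild of li.test, not a direct child.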

Perhaps you want this, which returns the first <a> found anywhere under the li:

soup.find("li", { "class" : "test" }).find('a')

Try this:

li = soup.find("li", { "class" : "test" })
children = li.find_all("a") # returns a list of all <a> descendants of li, nested ones included

Other reminders:

The find method returns only the first matching element. The find_all method returns every match in a list. By default both search all descendants, not just direct children; pass recursive=False to limit the search to direct children.
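
A small sketch of the default behaviour, in case it helps. The markup here is an assumed, compact example (one nested <a> between two direct ones) and the names are only for illustration.

```python
from bs4 import BeautifulSoup

# Assumed markup: link2 is nested deeper than link1 and link3
html = (
    '<li class="test"><a>link1</a>'
    "<ul><li><a>link2</a></li></ul>"
    "<a>link3</a></li>"
)

soup = BeautifulSoup(html, "html.parser")
li = soup.find("li", {"class": "test"})

# Without recursive=False, both methods look through every descendant
first_match = li.find("a")
all_matches = li.find_all("a")

print(first_match.get_text())                # link1
print([a.get_text() for a in all_matches])   # ['link1', 'link2', 'link3']
```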

"How to find all a which are children of <li class=test> but not any others?"

Given the HTML below (I added another <a> to show the difference between select and select_one):

<div>
  <li class="test">
    <a>link1</a>
    <ul>
      <li>
        <a>link2</a>
      </li>
    </ul>
    <a>link3</a>
  </li>
</div>

The solution is to use the child combinator (>), placed between two CSS selectors:

>>> soup.select('li.test > a')
[<a>link1</a>, <a>link3</a>]

In case you want to find only the first child:

>>> soup.select_one('li.test > a')
<a>link1</a>
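
Here is a runnable sketch of the same idea, assuming the HTML above is held in a string and parsed with "html.parser" (my choice of parser; the variable names are illustrative). It also contrasts the child combinator with the plain descendant selector (a space), which matches the nested link as well.

```python
from bs4 import BeautifulSoup

html = """
<div>
  <li class="test">
    <a>link1</a>
    <ul>
      <li>
        <a>link2</a>
      </li>
    </ul>
    <a>link3</a>
  </li>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# "li.test > a": only <a> elements that are direct children of li.test
direct = soup.select("li.test > a")
# "li.test a": any <a> nested anywhere inside li.test
nested = soup.select("li.test a")
# select_one returns just the first match (or None if nothing matches)
first = soup.select_one("li.test > a")

print([a.get_text() for a in direct])   # ['link1', 'link3']
print([a.get_text() for a in nested])   # ['link1', 'link2', 'link3']
print(first.get_text())                 # link1
```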
