I can extract a complete HTML tag with a Python script that I run from a Unix shell, like this:
#!/usr/bin/python3
# import the module
from bs4 import BeautifulSoup
# parse the file, naming the parser explicitly
with open("test.html") as fh:
    soup = BeautifulSoup(fh, "html.parser")
# get the tag
print(soup(itemprop="name"))
where itemprop="name" uniquely identifies the required tag.
The output is something like:
[<span itemprop="name">
                    Blabla & Bloblo</span>]
Now I would like to return only the text inside the tag, i.e. the Blabla & Bloblo part.
My attempt was:
print(soup(itemprop="name").getText())
But I get an error message like: AttributeError: 'ResultSet' object has no attribute 'getText'
The same method did work when I tried it in other contexts, such as:
print(soup.find('span').getText())
So what am I getting wrong?
Using the soup object as a callable returns a list of results, as if you used soup.find_all(). See the documentation:
Because find_all() is the most popular method in the Beautiful Soup search API, you can use a shortcut for it. If you treat the BeautifulSoup object or a Tag object as though it were a function, then it's the same as calling find_all() on that object.
Use soup.find() to find just the first match:
soup.find(itemprop="name").get_text()
or index into the result set:
soup(itemprop="name")[0].get_text()
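To see both options side by side, here is a minimal sketch. It assumes the markup from the question is embedded as a string (the surrounding div and the variable names are made up for illustration) and uses the standard library's html.parser. Because the text in the question starts with a newline and indentation, the sketch also strips the surrounding whitespace.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for test.html, based on the output shown in the question.
html = """<div><span itemprop="name">
                    Blabla &amp; Bloblo</span></div>"""
soup = BeautifulSoup(html, "html.parser")

# Calling the soup object is a shortcut for find_all(): it returns a list-like ResultSet.
matches = soup(itemprop="name")

# find() returns a single Tag (or None if nothing matches).
first = soup.find(itemprop="name")

# get_text() lives on Tag, not on ResultSet; strip() removes the leading newline and spaces.
text_from_find = first.get_text().strip()
text_from_index = matches[0].get_text().strip()

print(text_from_find)
```

Note that find() returns None when nothing matches, so real code may want to check for that before calling get_text().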