Hi, I want to get the description of an app in the Google Play Store (https://play.google.com/store/apps/details?id=com.wetter.androidclient&hl=de).
import urllib2
from bs4 import BeautifulSoup
soup = BeautifulSoup(urllib2.urlopen("https://play.google.com/store/apps/details?id=com.wetter.androidclient&hl=de"))
result = soup.find_all("div", {"class":"show-more-content text-body"})
With this code I get the whole content of this class, but I can't get only the text inside it. I tried a lot of things with next_sibling or .text, but it always throws errors (ResultSet has no attribute xxx).
I just want to get the text like this: "Die Android App von wetter.com! Sie erhalten: ..:"
Can anyone help me?
To get all the nodes beneath an element that match a given filter in Beautiful Soup, use the find_all() method.
Approach: First import the regular expressions (re) and BeautifulSoup libraries. Then open the HTML file to be parsed using the open function. Finally, call find_all with the tag to search for and, optionally, the text that the tag should contain.
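The approach described above could be sketched roughly as follows. The inline HTML is a made-up stand-in for a file read with open(), and the string= filter assumes Beautiful Soup 4.4 or newer (older releases call the same argument text=).

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the contents of an HTML file.
html = """
<div class="show-more-content text-body">
  <p>Die Android App von wetter.com!</p>
  <p>Sie erhalten: ...</p>
  <p>Unrelated paragraph</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Every <p> tag anywhere in the document.
all_paragraphs = soup.find_all("p")

# Only the <p> tags whose text matches the regular expression.
matching = soup.find_all("p", string=re.compile("wetter"))

for tag in matching:
    print(tag.text)
```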
Use the .text attribute on the elements; you have a list of results, so loop:
for res in result:
    print(res.text)
.text is a property that proxies for the Element.get_text() method.
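To illustrate why the error in the question appears, here is a small sketch. The HTML is a made-up stand-in for the real Play Store page, whose markup may differ. It shows that the ResultSet returned by find_all() has no .text of its own, while each element inside it does, and that calling get_text() directly lets you pass a separator and strip whitespace.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page is larger and may be structured differently.
html = (
    '<div class="show-more-content text-body">'
    "<p> Die Android App von wetter.com! </p>"
    "<p>Sie erhalten: ...</p>"
    "</div>"
)

soup = BeautifulSoup(html, "html.parser")
result = soup.find_all("div", {"class": "show-more-content text-body"})

# A ResultSet behaves like a list, so asking it for .text fails.
try:
    result.text
    raised = False
except AttributeError:
    raised = True

# Each element in the ResultSet does have .text / get_text().
texts = [res.get_text(separator=" ", strip=True) for res in result]
print(texts)
```

The separator and strip arguments are optional; plain res.text returns the same strings joined without any cleanup.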
Alternatively, if there is only ever supposed to be one such <div>, use .find() instead of .find_all():
result = soup.find("div", {"class":"show-more-content text-body"})
print(result.text)
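One caveat with .find() is that it returns None when nothing matches, so result.text would then raise an AttributeError. A minimal sketch of a guarded version follows. The helper names extract_description and fetch_description are hypothetical, and it uses urllib.request because urllib2 exists only in Python 2. Whether the live page still uses this class name is an assumption.

```python
from urllib.request import urlopen
from bs4 import BeautifulSoup


def extract_description(html):
    """Return the description text, or None if the <div> is not present."""
    soup = BeautifulSoup(html, "html.parser")
    result = soup.find("div", {"class": "show-more-content text-body"})
    if result is None:
        # find() found no match; avoid calling .text on None.
        return None
    return result.get_text(separator=" ", strip=True)


def fetch_description(url):
    """Download the page at url and extract the description from it."""
    with urlopen(url) as response:
        return extract_description(response.read())
```

Calling fetch_description with the URL from the question would perform the actual download; extract_description can be tried on any HTML string without network access.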