
Python and Beautiful Soup, pick up all elements [duplicate]

I'm extracting the text of an article from a website with Python and BeautifulSoup, and I've run into a strange problem. I just want to print the text inside the multiple p tags located in a div with class dr_article. My code currently looks like this:

from bs4 import BeautifulSoup

def getArticleText(webtext):
    soup = BeautifulSoup(webtext)
    divTag = soup.find_all("div", {"class":"dr_article"})
    for tag in divTag:
        pData = tag.find_all("p").text
        print pData

I'm getting the following error:

Traceback (most recent call last):
  File "<pyshell#14>", line 1, in <module>
    execfile("word_rank/main.py")
  File "word_rank/main.py", line 7, in <module>
    articletext.getArticleText(webtext)
  File "word_rank\articletext.py", line 7, in getArticleText
    pData = tag.find_all("p").text
AttributeError: 'list' object has no attribute 'text'

But when I select just the first element by adding [0] before .text, I don't get the error and it works as expected: it returns the first element's text. To be precise, the modified code looks like this:

from bs4 import BeautifulSoup

def getArticleText(webtext):
    soup = BeautifulSoup(webtext)
    divTag = soup.find_all("div", {"class":"dr_article"})
    for tag in divTag:
        pData = tag.find_all("p")[0].text
        print pData

My question is: how can I get the text from all the elements at once? What should I change so that I get the text from every element rather than only one?

dzordz asked Aug 01 '13 09:08


1 Answer

find_all returns a list of all matching elements, and a list has no .text attribute; only the individual elements do. Loop over the list and read .text from each element:

from bs4 import BeautifulSoup

def getArticleText(webtext):
    soup = BeautifulSoup(webtext)
    divTag = soup.find_all("div", {"class":"dr_article"})
    for tag in divTag:
        for element in tag.find_all("p"):
            pData = element.text
            print pData
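If you want the text back as a single string rather than printed piece by piece, the same loop can collect the paragraphs and join them. The following is only a sketch under a few assumptions: it targets Python 3 (the code above is Python 2), it passes "html.parser" explicitly as the parser, and the function name get_article_text and the sample markup are made up for illustration.

```python
from bs4 import BeautifulSoup


def get_article_text(webtext):
    """Return the text of every <p> inside each div.dr_article, one per line."""
    # Name the parser explicitly; "html.parser" ships with Python.
    soup = BeautifulSoup(webtext, "html.parser")
    paragraphs = []
    for div in soup.find_all("div", class_="dr_article"):
        for p in div.find_all("p"):
            # get_text() is called on each Tag, never on the list
            # that find_all() returns.
            paragraphs.append(p.get_text(strip=True))
    return "\n".join(paragraphs)


# Hypothetical sample markup, invented for this illustration.
sample_html = """
<div class="dr_article"><p>First paragraph.</p><p>Second paragraph.</p></div>
<div class="other"><p>Not part of the article.</p></div>
"""

print(get_article_text(sample_html))
```

Returning a string instead of printing inside the loop makes the function easier to reuse, for example to feed the article text into later processing.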

Or you can select each element separately:

tag.find_all("p")[0].text
tag.find_all("p")[1].text
...
tag.find_all("p")[N - 1].text
tag.find_all("p")[N].text  # N must be a valid index, at most len(tag.find_all("p")) - 1
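When you index like this, it is probably better to call find_all once and index the stored result, since each call searches the tag again. A minimal sketch, assuming Python 3 and a made-up HTML snippet; the variable names are illustrative only:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this illustration.
html = '<div class="dr_article"><p>one</p><p>two</p><p>three</p></div>'
soup = BeautifulSoup(html, "html.parser")
tag = soup.find("div", class_="dr_article")

# Search once, then work with the resulting list.
paragraphs = tag.find_all("p")
texts = [p.get_text() for p in paragraphs]

first = texts[0]    # first paragraph
last = texts[-1]    # last paragraph, same as texts[len(texts) - 1]
print(len(texts), first, last)
```

Valid indices run from 0 to len(texts) - 1; anything beyond that raises IndexError.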
4d4c answered Oct 21 '22 17:10