
Basic Python/Beautiful Soup Parsing [duplicate]

Say I use

date = r.find('abbr')

to get

<abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr>

I just want to print November 16, 2012, but if I try

print date.string

I get

AttributeError: 'NoneType' object has no attribute 'string'

What am I doing wrong?

UPDATE: Here's my code. Neither pair of print statements prints the raw string, but the uncommented pair does get the correct tags.

import urllib2
from BeautifulSoup import BeautifulSoup

page = urllib2.urlopen("some-url-path")
soup = BeautifulSoup(page)
calendar = soup.find('table',{"class" : "vcalendar ical"})
for r in calendar.findAll('tr'):
#   print ''.join(r.findAll('abbr',text=True))
#   print ''.join(r.findAll('strong',text=True))
    print r.find('abbr')
    print r.find('strong')
kevlar1818 asked Mar 29 '26 23:03


1 Answer

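soup.find('abbr').string should work fine, so there must be something wrong with date. The AttributeError means date is None: find() returns None when the element it searches contains no matching tag, so r.find('abbr') found no abbr in that r.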

from BeautifulSoup import BeautifulSoup

doc = '<abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr>'

soup = BeautifulSoup(doc)

for abbr in soup.findAll('abbr'):
    print abbr.string

Result:

November 16, 2012
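
If some rows of the table have no abbr at all (a header row, say), find() returns None for them and .string raises the AttributeError from the question. Here is a minimal sketch of guarding against that. Note the assumptions: it uses bs4 (the Python 3 successor to the BeautifulSoup 3 module used above) with the built-in "html.parser", and the HTML and the row_dates helper are made up for illustration, not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical table: the header row has no <abbr>, so find() returns None for it.
HTML = """
<table class="vcalendar ical">
  <tr><th>Date</th><th>Event</th></tr>
  <tr>
    <td><abbr class="dtstart" title="2012-11-16T00:00:00-05:00">November 16, 2012</abbr></td>
    <td><strong>Example event</strong></td>
  </tr>
</table>
"""


def row_dates(html):
    """Return the text of the first <abbr> in each table row that has one."""
    soup = BeautifulSoup(html, "html.parser")
    calendar = soup.find("table", {"class": "vcalendar ical"})
    dates = []
    for row in calendar.find_all("tr"):
        abbr = row.find("abbr")
        if abbr is None:
            # No <abbr> in this row; calling .string here would raise AttributeError.
            continue
        dates.append(abbr.string)
    return dates


print(row_dates(HTML))
```

The same `is None` check works with BeautifulSoup 3's find() as well; only the import and the findAll/find_all spelling differ.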

Update based on code added to question:

You can't use the text parameter like that.

http://www.crummy.com/software/BeautifulSoup/documentation.html#arg-text

text is an argument that lets you search for NavigableString objects instead of Tags

Either you're looking for text nodes, or you're looking for tags. A text node can't have a tag name.

Maybe you want ''.join([el.string for el in r.findAll('strong')])?
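
To make the text-node versus tag distinction concrete, here is a small sketch. It assumes bs4 on Python 3 rather than the BeautifulSoup 3 module from the question; in bs4 the argument is spelled string= and its behaviour when combined with a tag name is not the same as in BeautifulSoup 3, so the sketch only uses it on its own. The row markup is invented. It also uses get_text() instead of .string, because .string is None for a tag with more than one child and ''.join would then fail.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical row with two <strong> tags and some loose text between them.
ROW = "<tr><td><strong>Team A</strong> vs <strong>Team B</strong></td></tr>"

soup = BeautifulSoup(ROW, "html.parser")

# Searching by string returns text nodes (NavigableString), not tags.
text_nodes = soup.find_all(string=True)

# Searching by name returns Tag objects.
strong_tags = soup.find_all("strong")

# Join the text of each tag, as the answer suggests.
joined = "".join(el.get_text() for el in strong_tags)

print([type(n).__name__ for n in text_nodes])
print([type(t).__name__ for t in strong_tags])
print(joined)
```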

Acorn answered Apr 02 '26 07:04


