I need to display RSS feeds with Python, Atom for the most part. Coming from PHP, where I could get values quickly with $entry->link, I find lxml to be more precise and faster, albeit more complicated. After hours of probing I got this working with the arstechnica feed:
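import urllib.request
from lxml import etree

def GetRSSFeed(url):
    out = []
    feed = urllib.request.urlopen(url)
    feed = etree.parse(feed)
    feed = feed.getroot()
    # Walk every <item> and collect [title, link, description]
    for element in feed.iterfind(".//item"):
        meta = element.getchildren()
        title = meta[0].text   # assumes <title> is the first child
        link = meta[1].text    # assumes <link> is the second child
        desc = None            # stays None if the item has no <description>
        for subel in element.iterfind(".//description"):
            desc = subel.text
        entry = [title, link, desc]
        out.append(entry)
    return out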
Could this be done more easily? How can I access tags directly? Feedparser gets the job done with one line of code! Why?
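On accessing tags directly: a minimal sketch of looking up child elements by name instead of by position. To stay self-contained it uses the standard library's xml.etree.ElementTree rather than lxml; lxml.etree also provides iterfind and findtext, so the same calls should carry over. The SAMPLE_RSS string and the parse_items name are made up for illustration and are not part of the original code.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, used only so the sketch runs without a network call.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description>Summary of the first post</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def parse_items(xml_text):
    """Return [title, link, description] for every <item> in the feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iterfind(".//item"):
        items.append([
            # findtext returns the text of the first matching child element
            item.findtext("title"),
            item.findtext("link"),
            # default is returned when the item has no <description>
            item.findtext("description", default=""),
        ])
    return items


for title, link, desc in parse_items(SAMPLE_RSS):
    print(title, link, desc)
```

Looking tags up by name avoids depending on child order (meta[0], meta[1]). Note that Atom feeds put their elements in the http://www.w3.org/2005/Atom namespace, so the paths there would need the namespace, for example ".//{http://www.w3.org/2005/Atom}entry".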
Look at the feedparser library. It gives you a nicely formatted RSS object.
> import feedparser
> feed = feedparser.parse('http://feeds.marketwatch.com/marketwatch/marketpulse/')
> print(list(feed.keys()))
['feed',
'status',
'updated',
'updated_parsed',
'encoding',
'bozo',
'headers',
'etag',
'href',
'version',
'entries',
'namespaces']
> len(feed.entries)
30