Parsing different date formats from feedparser in python?

I'm trying to get the dates from entries in two different RSS feeds through feedparser.

Here is what I'm doing:

import feedparser as fp
reddit = fp.parse("http://www.reddit.com/.rss")
cc = fp.parse("http://contentconsumer.com/feed")
print(reddit.entries[0].date)
print(cc.entries[0].date)

And here's how they come out:

2008-10-21T22:23:28.033841+00:00

Wed, 15 Oct 2008 10:06:10 +0000

I want to get to the point where I can find out which is newer easily.

I've tried using the datetime module of Python and searching through the feedparser documentation, but I can't get past this problem. Any help would be much appreciated.

Alistair asked Oct 22 '08 11:10


1 Answer

Parsing dates from RSS feeds in the wild is a pain, and that's where feedparser can be a big help.

If you use the *_parsed properties (like updated_parsed), feedparser will have done the work and will return the date as a 9-tuple (a time.struct_time) in UTC, so the values from two different feeds can be compared directly.

See http://packages.python.org/feedparser/date-parsing.html for more gory details.
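As a rough sketch of that comparison, assuming both entries expose updated_parsed: the two struct_time values below are hand-built stand-ins for reddit.entries[0].updated_parsed and cc.entries[0].updated_parsed, filled in from the timestamps in the question so the snippet runs without fetching the feeds. to_datetime is a hypothetical helper name, not part of feedparser.

```python
import calendar
import time
from datetime import datetime, timezone


def to_datetime(parsed):
    """Convert a UTC time.struct_time (as in *_parsed) to an aware datetime.

    Hypothetical helper for illustration; not part of feedparser.
    """
    # calendar.timegm treats the tuple as UTC (time.mktime would assume local time).
    return datetime.fromtimestamp(calendar.timegm(parsed), tz=timezone.utc)


# Stand-ins for the *_parsed values of the two entries in the question.
# Fractional seconds are dropped; fields are
# (year, month, day, hour, minute, second, weekday, yearday, isdst).
reddit_parsed = time.struct_time((2008, 10, 21, 22, 23, 28, 1, 295, 0))
cc_parsed = time.struct_time((2008, 10, 15, 10, 6, 10, 2, 289, 0))

# The tuples compare field by field, so this is already a chronological check.
reddit_is_newer = reddit_parsed > cc_parsed

# Converting to datetime is optional, but handy for arithmetic or formatting.
reddit_dt = to_datetime(reddit_parsed)
cc_dt = to_datetime(cc_parsed)
newest = "reddit" if reddit_dt > cc_dt else "cc"
print(newest)
```

With real feeds, the same comparison would use the entries' own *_parsed values in place of the stand-ins. An entry with a missing or unparseable date may not have a usable value there, so it is worth checking before comparing.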

Martin Kenny answered Nov 06 '22 08:11