
Restricting RSS elements by date with feedparser. [Python]

I iterate over an RSS feed like so, where _file is the feed:

import feedparser

d = feedparser.parse(_file)
for element in d.entries:
    print(repr(element.date))

The date output comes out like so

u'Thu, 16 Jul 2009 15:18:22 EDT'

I can't work out how to turn the above date output into something I can compare, so that I can use it to limit feed elements. What I am asking is: how can I get an actual time out of this, so I can say "if more than 7 days old, skip this element"?

Recursion asked Dec 12 '25 01:12



1 Answer

feedparser is supposed to give you a struct_time object from Python's time module. I'm guessing it doesn't recognize that date format and so is giving you the raw string.
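
As far as I know, feedparser keeps the raw string on fields such as date or published and puts the parsed value on the matching *_parsed field (published_parsed, updated_parsed), which may be absent or None when the date could not be parsed. A minimal sketch of picking the first parsed date that is available; first_parsed_date is a hypothetical helper name, the sample entry is made-up data, and the only assumption is that entries are dict-like:

```python
import time

# Field names as documented by feedparser; which ones a feed fills in varies.
PARSED_DATE_KEYS = ("published_parsed", "updated_parsed")


def first_parsed_date(entry):
    """Return the first usable time.struct_time on a dict-like entry, or None."""
    for key in PARSED_DATE_KEYS:
        value = entry.get(key)
        if value is not None:
            return value
    return None


# Stand-in for a feedparser entry (hypothetical data, not real feed output).
sample_entry = {
    "published": "Thu, 16 Jul 2009 15:18:22 EDT",
    "published_parsed": time.gmtime(1247771902),
}
parsed = first_parsed_date(sample_entry)
print(parsed.tm_year, parsed.tm_mon, parsed.tm_mday)
```

If this returns None for every entry, the feed's date format is probably not being recognized.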

See here on how to add support for parsing malformed timestamps:

http://pythonhosted.org/feedparser/date-parsing.html
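
The linked page describes registering your own date handler. As a rough sketch of what such a handler could look like for the RFC 822 style string in the question, the standard library's email.utils can parse it, including the EDT zone name. parse_rfc822_utc is a hypothetical name, and the registration call is left as a comment because it needs feedparser installed:

```python
import email.utils
import time


def parse_rfc822_utc(date_string):
    """Parse an RFC 822 style date into a UTC time.struct_time, or None."""
    parsed = email.utils.parsedate_tz(date_string)
    if parsed is None:
        return None
    # mktime_tz applies the zone offset and returns seconds since the epoch (UTC).
    return time.gmtime(email.utils.mktime_tz(parsed))


# With feedparser installed, the handler could be registered like this:
# feedparser.registerDateHandler(parse_rfc822_utc)

when = parse_rfc822_utc("Thu, 16 Jul 2009 15:18:22 EDT")
print(time.strftime("%Y-%m-%d %H:%M:%S UTC", when))
```

The same function also works on its own, applied directly to the raw element.date string.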

If you manage to get it to give you the struct_time, you can read more about that here:

http://docs.python.org/library/time.html#time.struct_time

struct_time objects have everything you need. They have these members:

time.struct_time(tm_year=2010, tm_mon=2, tm_mday=4, tm_hour=23, tm_min=44, tm_sec=19, tm_wday=3, tm_yday=35, tm_isdst=0)

I generally convert the structs to seconds, like this:

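import calendar
import time

# feedparser's parsed dates are in UTC, so take the current time in UTC too
struct = time.gmtime()
# calendar.timegm() is the inverse of time.gmtime(): UTC struct_time -> epoch seconds
seconds = calendar.timegm(struct)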

Then you can just do regular math to see how many seconds have elapsed, or use the datetime module to do timedeltas.
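
A small sketch of the "regular math" route for the 7-day cutoff from the question. is_older_than and SEVEN_DAYS are hypothetical names, the struct_time values are assumed to be in UTC, and the two sample dates stand in for values you would read from the feed entries:

```python
import calendar
import time

SEVEN_DAYS = 7 * 24 * 60 * 60  # cutoff in seconds


def is_older_than(parsed_utc, max_age_seconds, now=None):
    """True if a UTC struct_time lies more than max_age_seconds before now."""
    if now is None:
        now = time.time()
    return now - calendar.timegm(parsed_utc) > max_age_seconds


# Hypothetical dates standing in for entry.published_parsed values.
old_date = time.gmtime(0)       # 1970-01-01, certainly older than 7 days
recent_date = time.gmtime()     # the current moment
for parsed in (old_date, recent_date):
    if is_older_than(parsed, SEVEN_DAYS):
        continue  # skip this element
    print(time.strftime("%Y-%m-%d", parsed))
```

The optional now argument is only there so the cutoff can be checked against a fixed reference time.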

FogleBird answered Dec 13 '25 16:12



