
Python: How to check for RSS updates with feedparser and etags

I'm trying to skip RSS feeds that have not been modified, using feedparser and ETags. I followed the guidelines in the documentation: http://pythonhosted.org/feedparser/http-etag.html

import feedparser

d = feedparser.parse('http://www.wired.com/wiredscience/feed/')
d2 = feedparser.parse('http://www.wired.com/wiredscience/feed/', etag=d.etag)

print(d2.status)

This outputs:

200

Shouldn't this script return a 304? My understanding is that the ETag changes whenever the RSS feed is updated, so if the ETag I send still matches the server's, I should get a 304.

How come I am not getting my expected result?

Marc asked May 24 '13 23:05



1 Answer

Apparently this server is configured to check the 'If-Modified-Since' header. You need to pass the last-modified time as well:

>>> d = feedparser.parse('http://www.wired.com/wiredscience/feed/')
>>> feedparser.parse('http://www.wired.com/wiredscience/feed/', 
                     etag=d.etag, modified=d.modified).status
304
>>> feedparser.parse('http://www.wired.com/wiredscience/feed/', 
                     etag=d.etag).status
200
Pavel Strakhov answered Sep 28 '22 05:09
