I'm trying to skip RSS feeds that have not been modified, using feedparser and ETags. I'm following the documentation: http://pythonhosted.org/feedparser/http-etag.html
import feedparser
d = feedparser.parse('http://www.wired.com/wiredscience/feed/')
d2 = feedparser.parse('http://www.wired.com/wiredscience/feed/', etag=d.etag)
print(d2.status)
This outputs:
200
Shouldn't this script return a 304? My understanding is that the ETag changes when the RSS feed is updated, so if the ETag I send still matches the server's, I should get a 304.
Why am I not getting the expected result?
Apparently this server is configured to check the 'If-Modified-Since' header. You need to pass the last-modified time as well:
>>> d = feedparser.parse('http://www.wired.com/wiredscience/feed/')
>>> feedparser.parse('http://www.wired.com/wiredscience/feed/',
...                  etag=d.etag, modified=d.modified).status
304
>>> feedparser.parse('http://www.wired.com/wiredscience/feed/',
...                  etag=d.etag).status
200
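For polling several feeds, the same idea can be wrapped in a small helper. This is only a sketch: fetch_if_changed and the cache dict are hypothetical names, not part of feedparser, and it assumes the server supplies at least one of the two validators. It always sends both etag and modified, since which one a given server honours varies.

```python
def fetch_if_changed(url, cache, parse=None):
    """Return the parsed feed, or None if the server answers 304 Not Modified.

    `cache` is a plain dict (hypothetical helper state) mapping
    url -> (etag, modified) from the previous successful fetch.
    """
    if parse is None:
        # Imported lazily so the helper can be exercised with a stub parser.
        import feedparser
        parse = feedparser.parse

    etag, modified = cache.get(url, (None, None))

    # Send both validators; some servers check only If-Modified-Since,
    # others only If-None-Match.
    result = parse(url, etag=etag, modified=modified)

    # `status` may be missing if no HTTP request was made, hence getattr.
    if getattr(result, "status", None) == 304:
        return None

    # Remember whichever validators the server supplied, for the next poll.
    cache[url] = (getattr(result, "etag", None),
                  getattr(result, "modified", None))
    return result
```

Calling fetch_if_changed(url, cache) twice in a row with the same dict should return the feed the first time and None the second time, provided the feed has not changed and the server honours conditional requests.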