Question

I'm trying to skip RSS feeds that have not been modified, using feedparser and ETags. I followed the guidelines in the documentation (http://pythonhosted.org/feedparser/http-etag.html):

import feedparser

d = feedparser.parse('http://www.wired.com/wiredscience/feed/')
d2 = feedparser.parse('http://www.wired.com/wiredscience/feed/', etag=d.etag)

print(d2.status)

This outputs:

200

Shouldn't this script return a 304? My understanding is that the ETag changes when the RSS feed is updated, so if the ETags match I should get a 304.

Why am I not getting the expected result?


Solution

Apparently this server also checks the 'If-Modified-Since' header, so you need to pass the last-modified time as well:

>>> d = feedparser.parse('http://www.wired.com/wiredscience/feed/')
>>> feedparser.parse('http://www.wired.com/wiredscience/feed/', 
                     etag=d.etag, modified=d.modified).status
304
>>> feedparser.parse('http://www.wired.com/wiredscience/feed/', 
                     etag=d.etag).status
200
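
To reuse this across runs, one option is to remember both validators per URL and send them back together. The following is only a minimal sketch: fetch_if_changed, the cache dict, and the parse parameter are hypothetical names, not part of feedparser. It assumes parse behaves like feedparser.parse (accepts etag= and modified= keywords and returns a dict-like result with optional 'status', 'etag', and 'modified' keys).

```python
def fetch_if_changed(url, cache, parse):
    """Return the parsed feed, or None if the server answered 304 Not Modified.

    `parse` is assumed to behave like feedparser.parse; `cache` is any dict
    that maps a URL to the validators seen on the previous fetch.
    """
    validators = cache.get(url, {})
    result = parse(
        url,
        etag=validators.get("etag"),          # sent as If-None-Match
        modified=validators.get("modified"),  # sent as If-Modified-Since
    )
    if result.get("status") == 304:
        return None  # unchanged since the last fetch; keep the old validators
    # Remember whichever validators the server sent (either may be missing).
    cache[url] = {
        "etag": result.get("etag"),
        "modified": result.get("modified"),
    }
    return result
```

With feedparser installed, this would presumably be called as fetch_if_changed(url, cache, feedparser.parse). Whether the second call yields a 304 still depends on how the server handles conditional requests.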
Licensed under: CC-BY-SA with attribution