Question

I need to get the published field of an RSS feed, and I need to know what the timezone is. I am storing the date in UTC, and I want another field to store the timezone so that I can later manipulate the datetime.

My current code is as follows:

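from datetime import datetime
from time import struct_time

for entry in feed['entries']:
    if hasattr(entry, 'published'):
        if isinstance(entry.published_parsed, struct_time):
            # First six fields: year, month, day, hour, minute, second
            dt = datetime(*entry.published_parsed[:6])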

The final value of dt is the correct datetime in UTC, but I also need to get the original timezone. Can anyone help?

EDIT:

For future reference, even though it is not part of my original question: if you need to handle a non-standard timezone abbreviation (like EST), you need to build a conversion table according to your own specification. Thanks to this answer: Parsing date/time string with timezone abbreviated name in Python?
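
A conversion table of that kind can be sketched with the standard library alone. The TZ_OFFSETS mapping and the parse_with_abbreviation helper below are hypothetical names, the offsets are assumptions to be replaced with your own specification, and the input format (YYYY-MM-DD HH:MM:SS ABBR) is assumed purely for illustration. dateutil's parser.parse also accepts a tzinfos mapping for the same purpose.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table: abbreviation -> UTC offset in hours.
# Abbreviations are ambiguous, so fill this in per your own specification.
TZ_OFFSETS = {"EST": -5, "EDT": -4, "PST": -8, "PDT": -7}


def parse_with_abbreviation(text):
    """Parse 'YYYY-MM-DD HH:MM:SS ABBR' into a timezone-aware datetime."""
    stamp, abbr = text.rsplit(" ", 1)
    offset = timedelta(hours=TZ_OFFSETS[abbr.upper()])
    naive = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone(offset, abbr.upper()))


dt = parse_with_abbreviation("2013-12-26 14:24:04 est")
print(dt.utcoffset())               # offset taken from the table
print(dt.astimezone(timezone.utc))  # same instant expressed in UTC
```

An abbreviation that is missing from the table raises KeyError, which is probably preferable to silently guessing an offset.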

Solution

You can use the parser.parse method of the dateutil package.

For example, for the Stack Overflow feed:

import feedparser
from dateutil import parser, tz

url = 'http://stackoverflow.com/feeds/tag/python'
feed = feedparser.parse(url)
published = feed.entries[0].published
dt = parser.parse(published)

print(published)
print(dt)  # timezone-aware datetime
print(dt.utcoffset())  # UTC offset of the original timezone
print(dt.astimezone(tz.tzutc()))  # same instant, converted to UTC

2012-11-28T19:07:32Z
2012-11-28 19:07:32+00:00
0:00:00
2012-11-28 19:07:32+00:00

You can see that published ends with Z, which means the time is given in UTC.

See "History of Date Formats" in the feedparser documentation:

Atom 1.0 states that all date elements “MUST conform to the date-time 
production in RFC 3339. 
In addition, an uppercase T character MUST be used to separate date and time, 
and an uppercase Z character MUST be present in the absence of 
a numeric time zone offset.”

Another example, with a feed that publishes a non-UTC offset:

import feedparser
from dateutil import parser, tz

url = 'http://omidraha.com/rss/'
feed = feedparser.parse(url)
published = feed.entries[0].published
dt = parser.parse(published)

print(published)
print(dt)  # timezone-aware datetime
print(dt.utcoffset())  # UTC offset of the original timezone
print(dt.astimezone(tz.tzutc()))  # same instant, converted to UTC

Thu, 26 Dec 2013 14:24:04 +0330
2013-12-26 14:24:04+03:30
3:30:00
2013-12-26 10:54:04+00:00

However, the result depends on the format of the date string received and on the type of feed.
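
Since the format varies by feed, one way to keep both the UTC value and the original offset is sketched below using only the standard library. It assumes the feed uses either the RFC 822 style (RSS) or the RFC 3339 style (Atom) seen in the two outputs above; split_utc_and_offset is a hypothetical helper, not part of feedparser or dateutil.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def split_utc_and_offset(published):
    """Return (naive UTC datetime, original UTC offset in minutes)."""
    try:
        # RSS style, e.g. 'Thu, 26 Dec 2013 14:24:04 +0330'
        dt = parsedate_to_datetime(published)
    except (TypeError, ValueError):
        # Atom style, e.g. '2012-11-28T19:07:32Z'
        dt = datetime.fromisoformat(published.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        raise ValueError("no timezone information in %r" % published)
    offset_minutes = int(dt.utcoffset().total_seconds() // 60)
    utc_naive = dt.astimezone(timezone.utc).replace(tzinfo=None)
    return utc_naive, offset_minutes


utc_dt, offset = split_utc_and_offset("Thu, 26 Dec 2013 14:24:04 +0330")
print(utc_dt, offset)

# Later: rebuild the original local time from the two stored fields.
original_tz = timezone(timedelta(minutes=offset))
local = utc_dt.replace(tzinfo=timezone.utc).astimezone(original_tz)
print(local)
```

Note that this stores only a fixed offset, not a named zone, so it cannot tell you about daylight saving rules at the source.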

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow