I am having some difficulty working with times/timezones. I have raw JSON data of the form
{
"Date": "28 Sep 2009 00:00:00",
....
}
This data is then loaded into MongoDB, and this string representation of the date is transformed into a JavaScript Date object. This conversion to UTC time results in the following date
{
"_id": ObjectId("577a788f4439e17afd4e21f7"),
"Date": ISODate("2009-09-27T23:00:00Z")
}
It "looks" as though the date has actually been moved forward a day, I'm assuming (perhaps incorrectly) that this is because my machine is set to Irish Standard Time.
I then read this data from MongoDB and use it to create a pandas DatetimeIndex
idx = pd.DatetimeIndex([x['Date'] for x in test_docs], freq='D')
which gives me
which is incorrect since the time has not been converted back correctly from UTC to local time. So I followed the solution given in this answer
idx = pd.DatetimeIndex([x['Date'] for x in test_docs], freq='D')
idx = idx.tz_localize(tz=tz.tzutc())
idx = idx.tz_convert(tz=tz.tzlocal())
frame = DataFrame(test_docs, index=idx)
frame = frame.drop('Date', 1)
which gives me the right day back
I then normalize the DatetimeIndex so the hours are removed, allowing me to group all entries by day.
frame.groupby(idx).sum()
At this point, however, something strange happens. The dates end up getting grouped as follows
but this doesn't reflect the dates in the frame
Can anyone shed some light on where I might be going wrong?
idx = pd.DatetimeIndex([x['Date'] for x in test_docs], freq='D')
idx = idx.tz_localize(tz=tz.tzutc())
idx = idx.tz_convert(tz='Europe/Dublin')
idx = idx.normalize()
frame = DataFrame(test_docs, index=idx)
...
...
aggregate = frame.groupby(idx).sum()
aggregate.plot()
this doesn't work for me, it results in the following plot
For some reason the groupby is not properly grouping for 2014, as shown below
If instead, I use
idx = idx.tz_convert(tz.gettz('Europe/Dublin'))
I get the same problem
idx = pd.DatetimeIndex([x['Date'] for x in test_docs], freq='D')
idx = idx.tz_localize(tz=tz.tzutc())
idx = idx.tz_convert(tz=tz.tzlocal())
idx = idx.normalize()
frame = DataFrame(test_docs, index=idx)
aggregate = frame.groupby(idx.astype(object)).sum()
This approach seems to work correctly for me
date attribute outputs an Index object containing the date values present in each of the entries of the DatetimeIndex object. Example #1: Use DatetimeIndex. date attribute to find the date part of the DatetimeIndex object.
Use df. dates1-df. dates2 to find the difference between the two dates and then convert the result in the form of months.
ISODate() is a helper function that's built into to MongoDB and wraps the native JavaScript Date object. When you use the ISODate() constructor from the Mongo shell, it actually returns a JavaScript Date object.
I was able to reproduce the error with the following data:
idx0 = pd.date_range('2011-11-11', periods=4)
idx1 = idx0.tz_localize(tz.tzutc())
idx2 = idx1.tz_convert(tz.tzlocal())
df = pd.DataFrame([1, 2, 3, 4])
df.groupby(idx2).sum()
Out[20]:
0
1970-01-01 00:00:00-05:00 9
2011-11-10 19:00:00-05:00 1
It's a bug deep in the pandas code, related exclusively to tz.tzlocal()
. It manifests itself also in:
idx2.tz_localize(None)
Out[27]:
DatetimeIndex(['2011-11-10 19:00:00', '1970-01-01 00:00:00',
'1970-01-01 00:00:00', '1970-01-01 00:00:00'],
dtype='datetime64[ns]', freq='D')
You can use any of the following solutions:
use explicitly your timezone as a string:
idx2 = idx1.tz_convert(tz='Europe/Dublin')
df.groupby(idx2).sum()
Out[29]:
0
2011-11-11 00:00:00+00:00 1
2011-11-12 00:00:00+00:00 2
2011-11-13 00:00:00+00:00 3
2011-11-14 00:00:00+00:00 4
or if it doesn't work:
idx2 = idx1.tz_convert(tz.gettz('Europe/Dublin'))
convert it to an object:
df.groupby(idx2.astype(object)).sum()
Out[32]:
0
2011-11-10 19:00:00-05:00 1
2011-11-11 19:00:00-05:00 2
2011-11-12 19:00:00-05:00 3
2011-11-13 19:00:00-05:00 4
Basically, converting to anything else than DatetimeIndex with tz=tz.local()
should work.
EDIT: This bug has been just fixed on pandas github. The fix will be available in pandas 0.19 release.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With