I have a very large DataFrame whose index stores dates as integers, for example 20171001. I want to convert these values to datetime format, so that 20171001 becomes '2017-10-01'.
For simplicity, here is a small DataFrame of the same form.
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.randn(3,2), columns=list('ab'), index=[20171001,20171002,20171003])
>>> df
                 a         b
20171001  2.205108  0.926963
20171002  1.104884 -0.445450
20171003  0.621504 -0.584352
>>> df.index
Int64Index([20171001, 20171002, 20171003], dtype='int64')
If I apply pd.to_datetime to df.index directly, I get an unexpected result:
>>> pd.to_datetime(df.index)
DatetimeIndex(['1970-01-01 00:00:00.020171001',
'1970-01-01 00:00:00.020171002',
'1970-01-01 00:00:00.020171003'],
dtype='datetime64[ns]', freq=None)
What I want is DatetimeIndex(['2017-10-01', '2017-10-02', '2017-10-03'], ...)
How can I do this? Note that the data file is given as is.
Pass format='%Y%m%d' to pd.to_datetime, i.e.
pd.to_datetime(df.index, format='%Y%m%d')
DatetimeIndex(['2017-10-01', '2017-10-02', '2017-10-03'], dtype='datetime64[ns]', freq=None)
To assign the result back to the index: df.index = pd.to_datetime(df.index, format='%Y%m%d')
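Putting the pieces together, here is a minimal, self-contained sketch using the sample data from the question. The variable name `as_nanoseconds` is only illustrative. It also shows why the call without a format looks odd: the integers are read as nanoseconds since 1970-01-01 rather than as calendar dates.

```python
import numpy as np
import pandas as pd

# Sample frame mirroring the question: integer index in YYYYMMDD form.
df = pd.DataFrame(
    np.random.randn(3, 2),
    columns=list("ab"),
    index=[20171001, 20171002, 20171003],
)

# Without a format, each integer is interpreted as a count of
# nanoseconds since the Unix epoch, hence the 1970 timestamps.
as_nanoseconds = pd.to_datetime(df.index)

# With an explicit format, each integer is parsed as year, month, day.
df.index = pd.to_datetime(df.index, format="%Y%m%d")

print(as_nanoseconds[0])
print(df.index)
```

After this, `df.index` is a DatetimeIndex, so date-based selection such as `df.loc['2017-10-02']` should work as usual.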
pd.to_datetime is the pandas way of doing it, but here are two alternatives:
import datetime
df.index = [datetime.datetime.strptime(str(i), "%Y%m%d") for i in df.index]
or
import datetime
df.index = df.index.map(lambda x: datetime.datetime.strptime(str(x),"%Y%m%d"))
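As a quick sanity check, the sketch below compares the `map`-based alternative with pd.to_datetime on the sample index. The names `vectorized` and `mapped` are hypothetical and used only for illustration. On a very large frame the vectorized pd.to_datetime call is likely to be the faster option, since the alternatives call strptime once per element in Python, but that is worth timing on your own data.

```python
import datetime

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randn(3, 2),
    columns=list("ab"),
    index=[20171001, 20171002, 20171003],
)

# Vectorized conversion with an explicit format.
vectorized = pd.to_datetime(df.index, format="%Y%m%d")

# Element-wise conversion through the standard library.
mapped = df.index.map(lambda x: datetime.datetime.strptime(str(x), "%Y%m%d"))

# Compare the two results as formatted date strings.
same_dates = [ts.strftime("%Y-%m-%d") for ts in vectorized] == [
    ts.strftime("%Y-%m-%d") for ts in mapped
]
print(same_dates)
```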