I have date data in formats like 29062017 and 01AUG2017.
As you can see, the month is in the middle of each value.
I want to convert this data to datetime, but when I use pd.to_datetime it doesn't work.
Do you know a good way to solve this problem?
You can use pd.to_datetime's format argument:
In [11]: s = pd.Series(["29062017", "01AUG2017"])
In [12]: pd.to_datetime(s, format="%d%m%Y", errors="coerce")
Out[12]:
0 2017-06-29
1 NaT
dtype: datetime64[ns]
In [13]: pd.to_datetime(s, format="%d%b%Y", errors="coerce")
Out[13]:
0 NaT
1 2017-08-01
dtype: datetime64[ns]
Note: errors="coerce" means that values which fail to parse become NaT.
You can then fill in the NaT values of one result from the other, e.g. using fillna:
In [14]: pd.to_datetime(s, format="%d%m%Y", errors="coerce").fillna(pd.to_datetime(s, format="%d%b%Y", errors="coerce"))
Out[14]:
0 2017-06-29
1 2017-08-01
dtype: datetime64[ns]
Any strings that don't match either format will remain NaT.
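If more than two formats turn up, the same fillna idea can be wrapped in a small helper. This is only a sketch: parse_mixed_dates and its formats parameter are hypothetical names, not part of pandas, and %b assumes English month abbreviations.

```python
import pandas as pd


def parse_mixed_dates(values, formats=("%d%m%Y", "%d%b%Y")):
    """Hypothetical helper: try each format in order, keep the first successful parse."""
    values = pd.Series(values)
    # Parse with the first format; anything that does not match becomes NaT.
    result = pd.to_datetime(values, format=formats[0], errors="coerce")
    for fmt in formats[1:]:
        # Fill only the remaining NaT entries from the next format's parse.
        result = result.fillna(pd.to_datetime(values, format=fmt, errors="coerce"))
    return result


s = pd.Series(["29062017", "01AUG2017", "not a date"])
parsed = parse_mixed_dates(s)
print(parsed)
```

Formats listed earlier take priority, so put the strictest or most common format first.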
The alternative would be to use a mapper and replace to substitute month codes with their numerical equivalents:
s = pd.Series(["29062017", "01AUG2017"]); s
0 29062017
1 01AUG2017
dtype: object
m = {'JAN' : '01', ..., 'AUG' : '08', ...} # you fill in the rest
s = s.replace(m, regex=True); s
0 29062017
1 01082017
dtype: object
Now all you need is a single pd.to_datetime call:
pd.to_datetime(s, format="%d%m%Y", errors="coerce")
0 2017-06-29
1 2017-08-01
dtype: datetime64[ns]
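For reference, a complete version of the mapper approach might look like the sketch below. It assumes the data uses upper-case, three-letter English month abbreviations; MONTHS and month_to_number are illustrative names, not anything defined by pandas.

```python
import pandas as pd

# Assumed set of month codes: upper-case English abbreviations.
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
# Map each code to its zero-padded month number, e.g. "AUG" -> "08".
month_to_number = {abbr: f"{i:02d}" for i, abbr in enumerate(MONTHS, start=1)}

s = pd.Series(["29062017", "01AUG2017"])
# regex=True makes replace substitute the code inside each string.
normalized = s.replace(month_to_number, regex=True)
dates = pd.to_datetime(normalized, format="%d%m%Y", errors="coerce")
print(dates)
```

Because the codes are replaced as substrings, this relies on the values containing no other text that happens to match a month abbreviation.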