I have a data set with a date column like this:
cod date value
0 1O8 2015-01-01 00:00:00 2.1
1 1O8 2015-01-01 01:00:00 2.3
2 1O8 2015-01-01 02:00:00 3.5
3 1O8 2015-01-01 03:00:00 4.5
4 1O8 2015-01-01 04:00:00 4.4
5 1O8 2015-01-01 05:00:00 3.2
6 1O9 2015-01-01 00:00:00 1.4
7 1O9 2015-01-01 01:00:00 8.6
8 1O9 2015-01-01 02:00:00 3.3
10 1O9 2015-01-01 03:00:00 1.5
11 1O9 2015-01-01 04:00:00 2.4
12 1O9 2015-01-01 05:00:00 7.2
The dtype of the date column is object. To apply some functions later, I need to convert the date column to datetime. I tried several approaches:
pd.to_datetime(df['date'], errors='raise', format ='%Y-%m-%d HH:mm:ss')
pd.to_datetime(df['date'], errors='coerce', format ='%Y-%m-%d HH:mm:ss')
df['date'].apply(pd.to_datetime, format ='%Y-%m-%d HH:mm:ss')
But I always get one of these errors:
TypeError: Unrecognized value type: <class 'str'>
ValueError: Unknown string format
The strange thing is that if I apply the function to a sample of the data set, it works correctly, but if I apply it to the whole data set, the error appears. The data has no missing values and the dtype is the same for all values.
How can I fix this error?
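There are three issues:
1. pd.to_datetime and pd.Series.apply don't work in place, so your solutions won't modify your series. Assign back after conversion.
2. Use errors='coerce' to guarantee no errors; values that cannot be parsed become NaT.
3. Format directives need a leading %, so the time part is %H:%M:%S, not HH:mm:ss.
So you can use: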
import pandas as pd

df = pd.DataFrame({'date': ['2015-01-01 00:00:00', '2016-12-20 15:00:20',
                            '2017-08-05 00:05:00', '2018-05-11 00:10:00']})

# Assign the result back; to_datetime returns a new Series
df['date'] = pd.to_datetime(df['date'], errors='coerce', format='%Y-%m-%d %H:%M:%S')

print(df)
date
0 2015-01-01 00:00:00
1 2016-12-20 15:00:20
2 2017-08-05 00:05:00
3 2018-05-11 00:10:00
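To see the first point (no in-place conversion) for yourself, here is a minimal sketch. The two-row frame is made-up sample data, and the check uses pd.api.types.is_datetime64_any_dtype so it does not depend on the exact dtype name your pandas version prints.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Made-up sample data in the same layout as the question
df = pd.DataFrame({'date': ['2015-01-01 00:00:00', '2015-01-01 01:00:00']})

# The return value is discarded here, so the column is left unchanged
pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S')
before = is_datetime64_any_dtype(df['date'])

# Assigning the result back is what actually converts the column
df['date'] = pd.to_datetime(df['date'], format='%Y-%m-%d %H:%M:%S')
after = is_datetime64_any_dtype(df['date'])

print(before, after)
```

If the second value printed is still False, the assignment step was probably skipped somewhere.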
In this particular instance, the format is standard and can be omitted:
df['date'] = pd.to_datetime(df['date'], errors='coerce')
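Since the conversion works on a sample but fails on the whole data set, a few rows probably don't match the format. One possible way to locate them is to coerce and then look at which values turned into NaT. This is only a sketch: the series below is invented, with two deliberately malformed entries, and the names raw, parsed and bad are just for illustration.

```python
import pandas as pd

# Invented data: the last two entries are deliberately malformed
raw = pd.Series(['2015-01-01 00:00:00',
                 '2015-01-01 01:00:00',
                 '2015-01-01 25:00:00',   # hour 25 does not exist
                 'n/a'])                  # not a date at all

# Unparseable values become NaT instead of raising
parsed = pd.to_datetime(raw, errors='coerce', format='%Y-%m-%d %H:%M:%S')

# Rows that had a value originally but failed to parse
bad = raw[parsed.isna() & raw.notna()]
print(bad)
```

Inspecting bad on your real column should show which strings are causing the error on the full data set.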