I am new to Pandas and Python. I want to do some datetime operations in my script. I am getting datetime information from a CSV file in the following format: 01APR2017 6:59
How can I convert it into the pandas datetime format? Something like: 2017-04-01 06:59:00
We can convert a string to a datetime using the strptime() function. It is available in both the datetime and time modules: datetime.datetime.strptime() returns a datetime object, and time.strptime() returns a time.struct_time.
With strptime(), a date and time in string form can be converted to a datetime type. The first parameter is the string and the second is the format specifier. One advantage of converting to a datetime is that you can then select the month, day, or time individually.
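As a minimal sketch of the strptime() call described above, applied to the format from the question. The variable names are illustrative, and %b matches month abbreviations for the current locale, so this assumes an English (or C) locale:

```python
from datetime import datetime

raw = "01APR2017 6:59"

# %d = day, %b = abbreviated month name, %Y = 4-digit year, %H:%M = hour:minute
parsed = datetime.strptime(raw, "%d%b%Y %H:%M")

print(parsed)         # 2017-04-01 06:59:00
print(parsed.month)   # 4
print(parsed.hour)    # 6
```

Once the value is a datetime object, the individual components are available as attributes such as .year, .month, .day, .hour and .minute.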
For non-standard datetime parsing, use pd.to_datetime after pd.read_csv. To parse an index or column with a mixture of timezones, specify date_parser to be a partially-applied pandas.to_datetime() with utc=True. See "Parsing a CSV with mixed timezones" in the pandas documentation for more.
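A short sketch of the "read first, convert afterwards" approach, assuming a hypothetical CSV with a date column and a value column (the column names and sample rows are made up for illustration):

```python
from io import StringIO

import pandas as pd

# Stand-in for a real file; replace StringIO(csv_text) with a file path.
csv_text = "date,value\n01APR2017 6:59,1\n02APR2017 18:05,2\n"

# Read the column as plain strings first...
df = pd.read_csv(StringIO(csv_text))

# ...then convert it with an explicit format.
df["date"] = pd.to_datetime(df["date"], format="%d%b%Y %H:%M")

print(df)
print(df.dtypes)
```

Keeping the conversion as a separate step makes it easy to inspect the raw strings when something fails to parse.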
Use datetime.strftime(format) to convert a datetime object into a string in the given format. The format codes are the standard directives that describe how the datetime should be represented. For example, the codes %d-%m-%Y %H:%M:%S produce a string in dd-mm-yyyy hh:mm:ss form.
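For illustration, the reverse direction with strftime() might look like this (the variable names are arbitrary):

```python
from datetime import datetime

moment = datetime(2017, 4, 1, 6, 59)

# Format the datetime as dd-mm-yyyy hh:mm:ss
formatted = moment.strftime("%d-%m-%Y %H:%M:%S")

print(formatted)  # 01-04-2017 06:59:00
```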
You can use to_datetime with the format parameter:
import pandas as pd

s = pd.Series(['01APR2017 6:59','01APR2017 6:59'])
print (s)
0 01APR2017 6:59
1 01APR2017 6:59
dtype: object
print (pd.to_datetime(s, format='%d%b%Y %H:%M'))
0 2017-04-01 06:59:00
1 2017-04-01 06:59:00
dtype: datetime64[ns]
Another possible solution is to use date_parser in read_csv:
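import pandas as pd
from datetime import datetime
from io import StringIO  # pandas.compat.StringIO is no longer available in current pandas

temp = u"""date
01APR2017 6:59
01APR2017 6:59"""
# after testing, replace StringIO(temp) with 'filename.csv'
# pd.datetime is no longer available in current pandas, so use datetime.datetime directly
parser = lambda x: datetime.strptime(x, '%d%b%Y %H:%M')
# note: date_parser is deprecated since pandas 2.0 in favor of date_format='%d%b%Y %H:%M'
df = pd.read_csv(StringIO(temp), parse_dates=[0], date_parser=parser)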
print (df)
date
0 2017-04-01 06:59:00
1 2017-04-01 06:59:00
print (df.date.dtype)
datetime64[ns]
EDIT by comment:
If some values cannot be parsed to datetime, add the parameter errors='coerce' to convert them to NaT:
s = pd.Series(['01APR2017 6:59','01APR2017 6:59', 'a'])
print (s)
0 01APR2017 6:59
1 01APR2017 6:59
2 a
dtype: object
print (pd.to_datetime(s, format='%d%b%Y %H:%M', errors='coerce'))
0 2017-04-01 06:59:00
1 2017-04-01 06:59:00
2 NaT
dtype: datetime64[ns]
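After coercing, it can be useful to see which original strings failed to parse. A small sketch, assuming the same format as above (the names parsed and bad are illustrative):

```python
import pandas as pd

s = pd.Series(["01APR2017 6:59", "01APR2017 6:59", "a"])

# Unparseable values become NaT instead of raising an error.
parsed = pd.to_datetime(s, format="%d%b%Y %H:%M", errors="coerce")

# Select the original strings whose parsed value is NaT.
bad = s[parsed.isna()]

print(bad)
```

Note that values that were already missing in the input would also show up as NaT, so this check does not distinguish them from malformed strings.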