I'm analyzing web server log files that contain date-times in the following format:
02/Apr/2013:23:55:00 +0530
I'm converting this to the pandas datetime format:
df['Time'] = pd.to_datetime(df['Time'])
But the column still has the object dtype:
print df.dtypes
Time object
Why is it not changing to datetime64[ns]?
Numpy version
In [2]: np.__version__
Out[2]: '1.8.0'
The date column is indeed a string, which pandas reports as the object dtype. You can convert it to a datetime type with the pandas to_datetime() function.
datetime64[ns] is a general dtype, while <M8[ns] is a specific dtype. General dtypes map to specific dtypes, but may be different from one installation of NumPy to the next. However, on a big endian machine, np.
We can convert a string to datetime using strptime() function. This function is available in datetime and time modules to parse a string to datetime and time objects respectively.
The strftime() function can change the date format in Python.
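As a minimal sketch of the strptime()/strftime() round trip described above, using only the standard library: it assumes Python 3.2 or later (where %z is supported) and an English locale for the month abbreviation. The variable names are illustrative only.

```python
from datetime import datetime, timezone

# Sample value in the web server log format from the question.
raw = "02/Apr/2013:23:55:00 +0530"

# strptime() parses the string; %z consumes the "+0530" UTC offset
# (assumption: Python 3.2+, where %z is supported).
parsed = datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")

# Normalise to UTC, then strftime() renders it in a different format.
as_utc = parsed.astimezone(timezone.utc)
iso_text = as_utc.strftime("%Y-%m-%d %H:%M:%S")
print(iso_text)  # 2013-04-02 18:25:00
```

Because the local time is 5 hours 30 minutes ahead of UTC, the UTC value is earlier than the logged time.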
Sorry if I missed something...
df['Time'] = df['Time'].astype('datetime64[ns]')
The following answer depends on your Python version.
pandas' to_datetime can't recognize your custom datetime format; you should provide it explicitly:
>>> import pandas as pd
>>> from datetime import datetime
>>> df = pd.DataFrame({'Time':['02/Apr/2013:23:55:00 +0530']},index=['tst'])
>>> from functools import partial
>>> to_datetime_fmt = partial(pd.to_datetime, format='%d/%b/%Y:%H:%M:%S %z')
and apply this custom converter:
>>> df['Time'] = df['Time'].apply(to_datetime_fmt)
>>> df.dtypes
Time datetime64[ns]
dtype: object
Note, however, that the %z approach works from Python 3.2 onward; in earlier versions %z is unsupported, so you have to apply the offset manually with a timedelta.
>>> from datetime import timedelta
>>> df = pd.DataFrame({'Time':['02/Apr/2013:23:55:00 +0530']},index=['tst'])
Split the time into the datetime and the offset, then subtract the offset to get UTC:
>>> def strptime_with_offset(string, format='%d/%b/%Y:%H:%M:%S'):
...     base_dt = datetime.strptime(string[:-6], format)
...     # The last five characters are the offset, e.g. '+0530'.
...     sign = -1 if string[-5] == '-' else 1
...     delta = sign * timedelta(hours=int(string[-4:-2]), minutes=int(string[-2:]))
...     # Local time = UTC + offset, so subtract the offset to get UTC.
...     return base_dt - delta
...
and apply this conversion function:
>>> df['Time'] = df['Time'].apply(strptime_with_offset)
>>> df['Time']
tst 2013-04-02 18:25:00
Name: Time, dtype: datetime64[ns]
>>> df.dtypes
Time datetime64[ns]
dtype: object
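As an alternative to the per-row apply used in this answer, the whole column can be parsed in one call. This is a sketch, assuming a pandas version where to_datetime accepts %z in the format string (0.24 or later, as far as I know). The second log line and the name parsed are made up for illustration.

```python
import pandas as pd

# Two sample log timestamps; the second one is illustrative only.
df = pd.DataFrame(
    {"Time": ["02/Apr/2013:23:55:00 +0530", "03/Apr/2013:00:10:00 +0530"]}
)

# Parse the whole column at once. utc=True converts every value to UTC,
# so the result is a single timezone-aware datetime column rather than
# a column of Python objects.
parsed = pd.to_datetime(df["Time"], format="%d/%b/%Y:%H:%M:%S %z", utc=True)

print(parsed.dtype)
print(parsed.iloc[0])
```

If naive timestamps are preferred, parsed.dt.tz_localize(None) drops the timezone while keeping the UTC wall-clock values.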