
Change object dtype to datetime64[ns] in pandas

I'm analyzing web server log files in which the date and time appear in the following format:

02/Apr/2013:23:55:00 +0530

I'm converting this to the pandas datetime format:

df['Time'] = pd.to_datetime(df['Time'])

But the column still has the object dtype:

print df.dtypes

Time object

Why is it not changing to datetime64[ns]?

NumPy version:

In [2]: np.__version__
Out[2]: '1.8.0'

Asked by Nilani Algiriyage on Nov 04 '13 at 08:11




2 Answers

Sorry if I missed something...

df['Time'] = df['Time'].astype('datetime64[ns]')

Answered by Dmitri K. on Sep 22 '22 at 08:09


The following answer depends on your Python version.

Pandas' to_datetime can't recognize your custom datetime format, so you should provide it explicitly:

>>> import pandas as pd
>>> from datetime import datetime
>>> df = pd.DataFrame({'Time':['02/Apr/2013:23:55:00 +0530']},index=['tst'])
>>> from functools import partial
>>> to_datetime_fmt = partial(pd.to_datetime, format='%d/%b/%Y:%H:%M:%S %z')

and apply this custom converter:

>>> df['Time'] = df['Time'].apply(to_datetime_fmt)
>>> df.dtypes
Time    datetime64[ns]
dtype: object
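
For comparison, here is a minimal sketch of the same conversion done on the whole column at once instead of through apply. It assumes a reasonably recent pandas on Python 3, where to_datetime accepts %z in the format string; utc=True is an addition that is not in the answer above, used here so that every row is converted to UTC and the column ends up with a single timezone-aware datetime dtype.

```python
import pandas as pd

# Same sample row as in the answer above.
df = pd.DataFrame({'Time': ['02/Apr/2013:23:55:00 +0530']}, index=['tst'])

# Parse the whole column with an explicit format. utc=True converts each
# parsed timestamp to UTC, so mixed offsets cannot leave the column as object.
df['Time'] = pd.to_datetime(df['Time'], format='%d/%b/%Y:%H:%M:%S %z', utc=True)

print(df['Time'].iloc[0])  # 2013-04-02 18:25:00+00:00
```

Note that %b matches English month abbreviations such as Apr only under an English or C locale.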

Note, however, that this works only from Python 3.2 onward; in earlier versions %z is unsupported, so you have to apply the offset manually.

>>> from datetime import timedelta
>>> df = pd.DataFrame({'Time':['02/Apr/2013:23:55:00 +0530']},index=['tst'])

Split the string into the datetime part and the UTC offset, then subtract the offset to get UTC:

>>> def strptime_with_offset(string, format='%d/%b/%Y:%H:%M:%S'):
...    # string ends in ' +HHMM' or ' -HHMM'; parse the part before it.
...    base_dt = datetime.strptime(string[:-6], format)
...    sign = -1 if string[-5] == '-' else 1
...    delta = timedelta(hours=int(string[-4:-2]), minutes=int(string[-2:]))
...    # UTC = local time minus its offset from UTC.
...    return base_dt - sign * delta
...

and apply this conversion function:

>>> df['Time'] = df['Time'].apply(strptime_with_offset)
>>> df['Time']
tst   2013-04-02 18:25:00
Name: Time, dtype: datetime64[ns]
>>> df.dtypes
Time    datetime64[ns]
dtype: object
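
On Python 3.2 and later, manual offset arithmetic can be cross-checked against the standard library, which parses the offset itself through %z. The sketch below is only an illustration; the helper name to_utc_naive is hypothetical and not part of the answer. It returns a naive datetime expressed in UTC for both a positive and a negative offset.

```python
from datetime import datetime, timezone

def to_utc_naive(stamp, fmt='%d/%b/%Y:%H:%M:%S %z'):
    """Parse a log timestamp with a +HHMM/-HHMM offset and return naive UTC."""
    # With %z in the format, strptime returns a timezone-aware datetime.
    aware = datetime.strptime(stamp, fmt)
    # Convert to UTC, then drop tzinfo to get a plain (naive) datetime.
    return aware.astimezone(timezone.utc).replace(tzinfo=None)

print(to_utc_naive('02/Apr/2013:23:55:00 +0530'))  # 2013-04-02 18:25:00
print(to_utc_naive('02/Apr/2013:23:55:00 -0430'))  # 2013-04-03 04:25:00
```

A timestamp that is ahead of UTC moves back by its offset, and one that is behind UTC moves forward, which is a quick way to check the sign handling of any hand-written parser.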

Answered by alko on Sep 19 '22 at 08:09