
How do I properly set the DatetimeIndex for a Pandas datetime object in a dataframe?

I have a pandas dataframe:

   lat        lng         alt  days          date        time
0  40.003834  116.321462  211  39745.175405  2008-10-24  04:12:35
1  40.003783  116.321431  201  39745.175463  2008-10-24  04:12:40
2  40.003690  116.321429  203  39745.175521  2008-10-24  04:12:45
3  40.003589  116.321427  194  39745.175579  2008-10-24  04:12:50
4  40.003522  116.321412  190  39745.175637  2008-10-24  04:12:55
5  40.003509  116.321484  188  39745.175694  2008-10-24  04:13:00

I am trying to combine the df['date'] and df['time'] columns into a single datetime column and use it as the index. I can do:

df['Datetime'] = pd.to_datetime(df['date']+df['time'])
df = df.set_index(['Datetime'])
del df['date']
del df['time']

And I get:

                    lat        lng         alt  days
Datetime
2008-10-2404:12:35  40.003834  116.321462  211  39745.175405
2008-10-2404:12:40  40.003783  116.321431  201  39745.175463
2008-10-2404:12:45  40.003690  116.321429  203  39745.175521
2008-10-2404:12:50  40.003589  116.321427  194  39745.175579
2008-10-2404:12:55  40.003522  116.321412  190  39745.175637

But then if I try:

df.between_time(time(1),time(22,59,59))['lng'].std() 

I get an error - 'TypeError: Index must be DatetimeIndex'

So, I've also tried setting the DatetimeIndex:

df['Datetime'] = pd.to_datetime(df['date']+df['time'])
#df = df.set_index(['Datetime'])
df = df.set_index(pd.DatetimeIndex(df['Datetime']))
del df['date']
del df['time']

And this throws an error also - 'DateParseError: unknown string format'

How do I create the datetime column and DatetimeIndex correctly so that df.between_time() works right?

user3654387 asked Nov 20 '14 04:11



1 Answer

To simplify Kirubaharan's answer a bit:

df['Datetime'] = pd.to_datetime(df['date'] + ' ' + df['time'])
df = df.set_index('Datetime')

And to get rid of unwanted columns (as OP did but did not specify per se in the question):

df = df.drop(['date','time'], axis=1) 
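
For completeness, here is a minimal end-to-end sketch of the steps above. The index values in the question's output read 2008-10-2404:12:35, so the date and time were apparently joined without a separator, which is likely why they were never parsed into real timestamps. The sample frame below is rebuilt by hand from the first three rows in the question, and it assumes the date and time columns are stored as strings; adjust it if your columns have a different dtype.

```python
from datetime import time

import pandas as pd

# Sample rows copied from the question (assumption: date and time are strings).
df = pd.DataFrame({
    "lat": [40.003834, 40.003783, 40.003690],
    "lng": [116.321462, 116.321431, 116.321429],
    "alt": [211, 201, 203],
    "days": [39745.175405, 39745.175463, 39745.175521],
    "date": ["2008-10-24"] * 3,
    "time": ["04:12:35", "04:12:40", "04:12:45"],
})

# Join with a space so each string reads like "2008-10-24 04:12:35".
df["Datetime"] = pd.to_datetime(df["date"] + " " + df["time"])

# Use the parsed column as the index and drop the source columns.
df = df.set_index("Datetime").drop(["date", "time"], axis=1)

# The index should now be a DatetimeIndex, which between_time requires.
is_datetime_index = isinstance(df.index, pd.DatetimeIndex)
selected = df.between_time(time(1), time(22, 59, 59))
lng_std = selected["lng"].std()
print(is_datetime_index, len(selected), lng_std)
```

If the strings always follow one layout, passing format="%Y-%m-%d %H:%M:%S" to pd.to_datetime makes the parsing explicit and surfaces malformed rows as errors instead of leaving them for later.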
Kracit answered Sep 27 '22 18:09