
pandas reindex DataFrame with datetime objects

Is it possible to reindex a pandas DataFrame using a column made up of datetime objects?

I have a DataFrame df with the following columns:

Int64Index: 19610 entries, 0 to 19609
Data columns:
cntr                  19610  non-null values  #int
datflt                19610  non-null values  #float
dtstamp               19610  non-null values  #datetime object
DOYtimestamp          19610  non-null values  #float
dtypes: int64(1), float64(2), object(1)

I can reindex the df easily along DOYtimestamp with df.reindex(index=df.DOYtimestamp), where DOYtimestamp has the following values:

>>> df['DOYtimestamp'].values
    array([ 153.76252315,  153.76253472,  153.7625463 , ...,  153.98945602,
    153.98946759,  153.98947917])

but I'd like to reindex the DataFrame along dtstamp, which is made up of datetime objects, so that I can generate different timestamps directly from the index. The dtstamp column has values that look like:

 >>> df['dtstamp'].values
     array([2012-06-02 18:18:02, 2012-06-02 18:18:03, 2012-06-02 18:18:04, ...,
     2012-06-02 23:44:49, 2012-06-02 23:44:50, 2012-06-02 23:44:51], 
     dtype=object)

When I try to reindex df along dtstamp I get the following:

>>> df.reindex(index=df.dtstamp)
    TypeError: can't compare datetime.datetime to long

I'm just not sure what I need to do to get the index to be of a datetime type. Any thoughts?

BFTM asked Jun 08 '12 05:06




1 Answer

It sounds like you don't want reindex. Somewhat confusingly, reindex does not define a new index; rather, it looks up the rows whose existing index labels match the ones you pass. So if you have a DataFrame with index [0, 1, 2], then reindex([2, 1, 0]) returns the rows in reverse order. Something like reindex([8, 9, 10]) does not make a new index for the rows; instead it returns a DataFrame filled with NaN values, since there are no rows with indices 8, 9, or 10. In your case the existing index is the integers 0 to 19609, so none of the datetime values in dtstamp can match it.
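As a minimal sketch of that behaviour, the snippet below builds a small made-up DataFrame (the column name `value` and its contents are illustrative, not taken from the question) and calls reindex with both matching and non-matching labels:

```python
import pandas as pd

# Hypothetical three-row frame with the default-style integer index 0, 1, 2.
df = pd.DataFrame({"value": [10.0, 20.0, 30.0]}, index=[0, 1, 2])

# Labels that exist: reindex just selects those rows in the requested order.
reversed_rows = df.reindex([2, 1, 0])

# Labels that do not exist: the new frame has those labels, but every value is NaN.
missing_rows = df.reindex([8, 9, 10])

print(reversed_rows)
print(missing_rows)
```

The second result keeps the requested labels 8, 9, 10 as its index but carries none of the original data, which is why reindex is the wrong tool for relabelling existing rows.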

It seems like what you want is to keep the same rows but give them a totally new index. For that you can assign to the index directly, so try df.index = df['dtstamp'].
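Here is a small sketch of the direct assignment, assuming a cut-down frame with three of the timestamps from the question (the `cntr` and `datflt` values are made up). It also shows `set_index`, which is an alternative not mentioned above; it does the same relabelling but moves the column into the index instead of keeping a copy of it.

```python
from datetime import datetime

import pandas as pd

# Cut-down stand-in for the question's DataFrame; cntr/datflt values are invented.
df = pd.DataFrame({
    "cntr": [1, 2, 3],
    "datflt": [0.1, 0.2, 0.3],
    "dtstamp": [
        datetime(2012, 6, 2, 18, 18, 2),
        datetime(2012, 6, 2, 18, 18, 3),
        datetime(2012, 6, 2, 18, 18, 4),
    ],
})

# Direct assignment: same rows, new labels. The dtstamp column is kept as well.
by_assignment = df.copy()
by_assignment.index = by_assignment["dtstamp"]

# Alternative (an addition, not from the answer): set_index returns a new frame
# and removes dtstamp from the columns.
by_set_index = df.set_index("dtstamp")

# Rows can now be looked up by timestamp.
row = by_assignment.loc[pd.Timestamp("2012-06-02 18:18:03")]
print(row["cntr"])
```

On recent pandas versions both results should carry a DatetimeIndex, so time-based selection works directly on the index.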

BrenBarn answered Sep 22 '22 11:09