Is it possible to reindex a pandas DataFrame using a column made up of datetime objects?
I have a DataFrame df with the following columns:
Int64Index: 19610 entries, 0 to 19609
Data columns:
cntr 19610 non-null values #int
datflt 19610 non-null values #float
dtstamp 19610 non-null values #datetime object
DOYtimestamp 19610 non-null values #float
dtypes: int64(1), float64(2), object(1)
I can reindex the df easily along DOYtimestamp with: df.reindex(index=df.DOYtimestamp)
and DOYtimestamp has the following values:
>>> df['DOYtimestamp'].values
array([ 153.76252315, 153.76253472, 153.7625463 , ..., 153.98945602,
153.98946759, 153.98947917])
but I'd like to reindex the DataFrame along dtstamp, which is made up of datetime objects, so that I can generate different timestamps directly from the index. The dtstamp column has values which look like:
>>> df['dtstamp'].values
array([2012-06-02 18:18:02, 2012-06-02 18:18:03, 2012-06-02 18:18:04, ...,
2012-06-02 23:44:49, 2012-06-02 23:44:50, 2012-06-02 23:44:51],
dtype=object)
When I try to reindex df along dtstamp, I get the following:
>>> df.reindex(index=df.dtstamp)
TypeError: can't compare datetime.datetime to long
I'm just not sure what I need to do to get the index to be of a datetime type. Any thoughts?
Reindexing the columns using the axis keyword: one can reindex a single column or multiple columns by calling the reindex() method and specifying the axis to reindex. Labels in the new index that are not present in the DataFrame are filled with NaN.
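A minimal sketch of reindexing along the column axis. The toy frame and the "extra" label are made up for illustration; only the column names cntr and datflt come from the question.

```python
import pandas as pd

# Toy frame; the values are invented for this example.
df = pd.DataFrame({"cntr": [1, 2], "datflt": [0.5, 1.5]})

# Reorder the existing columns and request a label that does not exist.
# "extra" is a hypothetical column name, so it comes back filled with NaN.
cols = df.reindex(["datflt", "cntr", "extra"], axis="columns")

print(cols)
```

Note that reindex returns a new DataFrame here and leaves df itself unchanged.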
To get a new datetime column and set it as a DatetimeIndex, we can parse the column with the to_datetime function (using its format parameter to describe the layout of the strings) and then call set_index on the result.
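Roughly, that could look like the following. This assumes the timestamps start out as strings in a column I've called dtstamp_str (a made-up name), and the format string is a guess matching the values shown in the question.

```python
import pandas as pd

# Toy frame with timestamps stored as plain strings (assumed input).
df = pd.DataFrame({
    "dtstamp_str": ["2012-06-02 18:18:02", "2012-06-02 18:18:03"],
    "datflt": [0.1, 0.2],
})

# Parse the strings into a real datetime column using an explicit format.
df["dtstamp"] = pd.to_datetime(df["dtstamp_str"], format="%Y-%m-%d %H:%M:%S")

# set_index returns a new frame whose index is built from that column.
indexed = df.set_index("dtstamp")

print(type(indexed.index))
```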
The reindex() method allows you to change the row index and the column labels. Note: any label in the new index that does not exist in the old one is filled with NaN.
It sounds like you don't want reindex. Somewhat confusingly, reindex is not for defining a new index, exactly; rather, it looks for rows that have the specified indices. So if you have a DataFrame with index [0, 1, 2], then doing a reindex([2, 1, 0]) will return the rows in reverse order. Doing something like reindex([8, 9, 10]) does not make a new index for the rows; rather, it will return a DataFrame with NaN values, since there are no rows with indices 8, 9, or 10.
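To make that concrete, here is a small sketch of both cases. The frame and its val column are invented for the example.

```python
import pandas as pd

# Toy frame with the default integer index [0, 1, 2].
df = pd.DataFrame({"val": [10.0, 20.0, 30.0]})

# Labels that exist: the matching rows come back in the requested order.
reversed_df = df.reindex([2, 1, 0])

# Labels that do not exist: no rows match, so every value is NaN.
missing_df = df.reindex([8, 9, 10])

print(reversed_df)
print(missing_df)
```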
It seems like what you want is to just keep the same rows, but make a totally new index for them. For that you can just assign to the index directly. So try doing df.index = df['dtstamp'].
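A sketch of that assignment on a toy frame shaped like the one in the question. Wrapping the column in pd.to_datetime is my own addition, not part of the suggestion above; I'm assuming it is wanted so that the result is definitely a DatetimeIndex even though the column holds plain datetime objects with object dtype.

```python
from datetime import datetime

import pandas as pd

# Toy data; dtstamp is forced to object dtype to mimic the question's column.
stamps = [
    datetime(2012, 6, 2, 18, 18, 2),
    datetime(2012, 6, 2, 18, 18, 3),
    datetime(2012, 6, 2, 18, 18, 4),
]
df = pd.DataFrame({
    "cntr": [1, 2, 3],
    "datflt": [0.1, 0.2, 0.3],
    "dtstamp": pd.Series(stamps, dtype=object),
})

# Keep the same rows, but replace the index with the parsed timestamps.
df.index = pd.to_datetime(df["dtstamp"])

# Datetime fields can now be read straight off the index.
day_of_year = df.index.dayofyear
print(df.index)
```

With a DatetimeIndex in place, attributes such as df.index.dayofyear or df.index.hour are available directly, which seems to be what the question is after.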