 

Selecting single row as dataframe with DatetimeIndex

Tags: python, pandas

I have a time series in a DataFrame with a DatetimeIndex, like this:

import pandas as pd
dates= ["2015-10-01 00:00:00",
         "2015-10-01 01:00:00",
         "2015-10-01 02:00:00",
         "2015-10-01 03:00:00",
         "2015-10-01 04:00:00"]
df = pd.DataFrame(index=pd.DatetimeIndex(dates))
df["values"] = range(0,5)

Out[]:
                     values
2015-10-01 00:00:00       0
2015-10-01 01:00:00       1
2015-10-01 02:00:00       2
2015-10-01 03:00:00       3
2015-10-01 04:00:00       4

I would like to select a single row as simply and cleanly as possible, using the date as the key (e.g. "2015-10-01 02:00:00"), and get the result back as a DataFrame like this:

Out[]:
                     values
2015-10-01 02:00:00       2

Simply using indexing results in a key error:

df["2015-10-01 02:00:00"]
Out[]:
KeyError: '2015-10-01 02:00:00'

Similarly this:

df.loc[["2015-10-01 02:00:00"]]
Out[]:
KeyError: "None of [['2015-10-01 02:00:00']] are in the [index]"

These surprisingly (?) result in the same series as follows:

df.loc["2015-10-01 02:00:00"]
Out[]:
values    2
Name: 2015-10-01 02:00:00, dtype: int32

df.loc["2015-10-01 02:00:00",:]
Out[]:

values    2
Name: 2015-10-01 02:00:00, dtype: int32

print(type(df.loc["2015-10-01 02:00:00"]))
print(type(df.loc["2015-10-01 02:00:00",:]))
print(df.loc["2015-10-01 02:00:00"].shape)
print(df.loc["2015-10-01 02:00:00",:].shape)
Out[]:
<class 'pandas.core.series.Series'>
<class 'pandas.core.series.Series'>
(1,)
(1,)

I could wrap either of those in a DataFrame like this:

slize = pd.DataFrame(df.loc["2015-10-01 02:00:00",:])
Out[]:
        2015-10-01 02:00:00
values                    2

Of course I could do this to reach my result:

slize.T
Out[]:
                     values
2015-10-01 02:00:00       2

But at this point the Series could just as well be a column, so it is hard to test whether it is a row Series or a column Series in order to add the .T automatically. Did I miss a way of selecting what I want?

asked Dec 13 '25 09:12 by Marcus V.


2 Answers

I recommend generating your index with pd.date_range for convenience, and then using .loc with a Timestamp or datetime object.

from datetime import datetime

import pandas as pd

start = datetime(2015, 10, 1, 0, 0, 0)
end = datetime(2015, 10, 1, 4, 0, 0)
dates = pd.date_range(start, end, freq='h')  # the uppercase 'H' alias is deprecated in recent pandas
df = pd.DataFrame(index=pd.DatetimeIndex(dates))
df["values"] = range(0,5)

Then pass the Timestamp or datetime object to .loc inside a list, so the result stays a DataFrame:

In [2]: df.loc[[start]]
Out[2]:
            values
2015-10-01       0
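
The same pattern applies to the timestamp from the question. Here is a minimal sketch, assuming the five-row hourly frame from the question; the names `ts` and `row` are only illustrative, and the frequency is passed as a Timedelta to avoid depending on a particular frequency alias:

```python
import pandas as pd

# Rebuild the five-row hourly frame from the question.
dates = pd.date_range("2015-10-01", periods=5, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"values": range(5)}, index=dates)

# Convert the string key to a Timestamp, then wrap it in a list:
# a list-like key makes .loc keep the result two-dimensional.
ts = pd.Timestamp("2015-10-01 02:00:00")
row = df.loc[[ts]]

print(type(row).__name__)  # DataFrame
print(row.shape)           # (1, 1)
```

Because the key is a list, no wrapping or transposing is needed afterwards.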

Further details

Simply using indexing results in a key error:

df["2015-10-01 02:00:00"]
Out[]:
KeyError: '2015-10-01 02:00:00'

The KeyError occurs because df[...] with a single string looks for a column named "2015-10-01 02:00:00", and the only column is "values".

Similarly this:

df.loc[["2015-10-01 02:00:00"]]
Out[]:
KeyError: "None of [['2015-10-01 02:00:00']] are in the [index]"

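Your second option fails because, as the error shows, the string inside the list is not matched against the DatetimeIndex. Use exact indexing with a Timestamp or datetime object instead, as shown above.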

These surprisingly (?) result in the same series as follows:

df.loc["2015-10-01 02:00:00"]
Out[]:
values    2
Name: 2015-10-01 02:00:00, dtype: int32

If you pass .loc a single row label, the result is reduced to a Series, as you noticed. To get a DataFrame from that Series, you have to convert it to a DataFrame and then transpose the result.
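
The difference comes down to the kind of key passed to .loc: a scalar label drops the row dimension, while a list or a slice keeps it. Here is a short sketch on the question's data; the slice variant is an addition that is not mentioned above, and it assumes the index is sorted:

```python
import pandas as pd

dates = pd.date_range("2015-10-01", periods=5, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"values": range(5)}, index=dates)
ts = pd.Timestamp("2015-10-01 02:00:00")

as_series = df.loc[ts]     # scalar key: the row dimension is dropped
as_frame = df.loc[[ts]]    # list key: the result stays a DataFrame
as_slice = df.loc[ts:ts]   # slice key (both ends inclusive): also a DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(type(as_slice).__name__)
```

Choosing the list or slice form up front avoids having to guess later whether a Series came from a row or a column.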

answered Dec 15 '25 16:12 by MCMZL


You can convert the string to a datetime and use exact indexing:

print(df.loc[[pd.to_datetime("2015-10-01 02:00:00")]])
                     values
2015-10-01 02:00:00       2

Or convert the Series to a DataFrame and transpose:

print(df.loc["2015-10-01 02:00:00"].to_frame().T)
                     values
2015-10-01 02:00:00       2
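
One difference between the two options shows up once the frame has columns of different dtypes. A row extracted as a Series has to hold all of its values under one dtype, so the transposed frame can end up with object columns, whereas list-based selection keeps each column's dtype. Here is a sketch, assuming a hypothetical extra column `label` that is not part of the question:

```python
import pandas as pd

dates = pd.date_range("2015-10-01", periods=5, freq=pd.Timedelta(hours=1))
# "label" is a made-up text column, added only to create mixed dtypes.
df = pd.DataFrame({"values": range(5), "label": list("abcde")}, index=dates)
ts = pd.to_datetime("2015-10-01 02:00:00")

exact = df.loc[[ts]]                   # list key: per-column dtypes are kept
transposed = df.loc[ts].to_frame().T   # via Series: values pass through one shared dtype

print(exact.dtypes)
print(transposed.dtypes)
```

With a single numeric column, as in the question, both routes print the same table, so the distinction only matters for wider frames.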

answered Dec 15 '25 18:12 by jezrael


