In pandas, I have a Dataframe indexed by timestamp. Looks like the following table:
A B C D
DATE
2018-01-17 157.52 163.74 157.28 159.84
2018-01-16 158.25 159.35 155.93 157.40
2018-01-15 157.15 159.59 156.79 158.86
2018-01-12 158.25 158.62 157.40 157.52
And I'm trying to select a row by index, and also select the next few rows. (For example, select two rows start at 2018-01-12). I found both .loc and .iloc are hard to do such task. Is there any other ways to do that?
To get the nth row in a Pandas DataFrame, we can use the iloc() method. For example, df. iloc[4] will return the 5th row because row numbers start from 0.
Solution #1: Using the DataFrame's index, followed by head(2)
:
df['2018-01-12':].head(2)
Solution #2: Using iloc
:
i = df.index.get_loc('2018-01-12')
df.iloc[i:i+2]
Bonus solution #3: It seems like you're analyzing stock data. Maybe you're interested in something that could be more efficiently done using rolling windows? (Moving Averages maybe?) Consider using rolling
, e.g. to calculate the rolling mean:
df.rolling(2).mean()
You should be able to do something like:
date_of_wanted_row = "2018-01-12"
iloc_of_wanted_row = df.index.get_loc(date_of_wanted_row)
return df.iloc[iloc_of_wanted_row: iloc_of_wanted_row + 4]
(but with more sensible variable names, and I'm assuming the dates in the index are not really strings)
You can use (to get values between dates):
df.loc['2018-01-12':'2018-01-14']
or, to get all data from January:
df.loc['2018-01']
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With