Assume a simple DataFrame, for example:
        A         B
    0   1  0.810743
    1   2  0.595866
    2   3  0.154888
    3   4  0.472721
    4   5  0.894525
    5   6  0.978174
    6   7  0.859449
    7   8  0.541247
    8   9  0.232302
    9  10  0.276566
How can I retrieve the index value of a row, given a condition? For example,
    dfb = df[df['A']==5].index.values.astype(int)
returns [4], but what I would like to get is just 4. This is causing me trouble later in the code.
Based on some conditions, I want to record the index values where each condition is fulfilled, and then select the rows between them.
I tried
    dfb = df[df['A']==5].index.values.astype(int)
    dfbb = df[df['A']==8].index.values.astype(int)
    df.loc[dfb:dfbb,'B']
for a desired output
       A         B
    4  5  0.894525
    5  6  0.978174
    6  7  0.859449
but I get TypeError: '[4]' is an invalid key
There may be many times when you want to know the row number of a particular value, and thankfully pandas makes this quite easy with the .index attribute. Practically speaking, this returns the index labels of the rows (which coincide with 0-based row positions under the default index), rather than a row number as you may be familiar with in Excel.
To get the nth row of a pandas DataFrame by position, use the iloc indexer. For example, df.iloc[4] returns the 5th row because positions start from 0.
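The difference between an index label and a row position only shows up when the index is not the default 0, 1, 2, ... A minimal sketch, assuming a hypothetical frame named demo with labels 10, 20, 30 (not part of the question's data):

```python
import pandas as pd

# Hypothetical frame whose index labels differ from the row positions.
demo = pd.DataFrame({"A": [1, 2, 3]}, index=[10, 20, 30])

# Index label of the first row where A == 2.
label = demo[demo["A"] == 2].index[0]

# 0-based position of that label within the index (labels are unique here).
position = demo.index.get_loc(label)

by_label = demo.loc[label, "A"]        # label-based lookup
by_position = demo.iloc[position, 0]   # position-based lookup, column 0 is "A"

print(label, position, by_label, by_position)
```

With the default index used in the question, label and position are the same number, which is why the two are easy to confuse.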
The easiest fix is to add [0], which selects the first value of the one-element array:
    dfb = df[df['A']==5].index.values.astype(int)[0]
    dfbb = df[df['A']==8].index.values.astype(int)[0]
Or take the first index label directly and convert it to int:
    dfb = int(df[df['A']==5].index[0])
    dfbb = int(df[df['A']==8].index[0])
But if some value might not match, an error is raised, because the first element does not exist.
A solution is to use next with iter, which returns a default value when nothing matches:
    dfb = next(iter(df[df['A']==5].index), 'no match')
    print (dfb)
    4

    dfb = next(iter(df[df['A']==50].index), 'no match')
    print (dfb)
    no match
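If this lookup is needed in several places, the next/iter pattern can be wrapped in a small helper. This is only a sketch: the function name first_matching_index and its default parameter are hypothetical, and the frame below rebuilds just column A of the example.

```python
import pandas as pd


def first_matching_index(frame, mask, default=None):
    """Return the index label of the first row where mask is True, else default."""
    return next(iter(frame[mask].index), default)


# Column A of the example frame, with the default 0..9 index.
df = pd.DataFrame({"A": range(1, 11)})

start = first_matching_index(df, df["A"] == 5)     # a match exists
missing = first_matching_index(df, df["A"] == 50)  # no match, falls back to default

print(start, missing)
```

Returning None rather than a string such as 'no match' makes it easier to test for a missing value before slicing.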
Then it seems you need to subtract 1, because .loc label slices include both endpoints:
    print (df.loc[dfb:dfbb-1,'B'])
    4    0.894525
    5    0.978174
    6    0.859449
    Name: B, dtype: float64
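A minimal sketch of why the subtraction matters: label slices with .loc include the end label, while positional slices with .iloc stop before the end position. The names start and stop are illustrative, and only column A of the example frame is rebuilt here.

```python
import pandas as pd

# Column A of the example frame, with the default 0..9 index.
df = pd.DataFrame({"A": range(1, 11)})

start = int(df[df["A"] == 5].index[0])
stop = int(df[df["A"] == 8].index[0])

# .loc slices by label and keeps the row labelled `stop`.
inclusive = df.loc[start:stop, "A"].tolist()

# Subtracting 1 drops that last row (valid because the labels are consecutive integers).
trimmed = df.loc[start:stop - 1, "A"].tolist()

# .iloc slices by position and excludes the end, like a Python list slice.
# It matches `trimmed` here only because labels and positions coincide.
positional = df.iloc[start:stop]["A"].tolist()

print(inclusive, trimmed, positional)
```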
Another solution uses boolean indexing or query:
    print (df[(df['A'] >= 5) & (df['A'] < 8)])
       A         B
    4  5  0.894525
    5  6  0.978174
    6  7  0.859449

    print (df.loc[(df['A'] >= 5) & (df['A'] < 8), 'B'])
    4    0.894525
    5    0.978174
    6    0.859449
    Name: B, dtype: float64

    print (df.query('A >= 5 and A < 8'))
       A         B
    4  5  0.894525
    5  6  0.978174
    6  7  0.859449
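As a quick check that these condition-based selections agree, the sketch below compares them on column A of the example frame. The Series.between call is an additional option not used in the answer above; it includes both bounds by default, so the upper bound becomes 7, which is equivalent to A < 8 only because A holds integers.

```python
import pandas as pd

# Column A of the example frame, with the default 0..9 index.
df = pd.DataFrame({"A": range(1, 11)})

# Boolean mask: 5 <= A < 8.
mask = (df["A"] >= 5) & (df["A"] < 8)
by_mask = df[mask]

# The same condition written as a query string.
by_query = df.query("A >= 5 and A < 8")

# between() is inclusive on both ends by default.
by_between = df[df["A"].between(5, 7)]

print(by_mask.index.tolist())
```

None of these variants needs the index values to be looked up first, so they avoid the empty-match error entirely: an unmatched condition simply yields an empty frame.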