
Get index of a row of a pandas dataframe as an integer

Assume an easy dataframe, for example

        A         B
    0   1  0.810743
    1   2  0.595866
    2   3  0.154888
    3   4  0.472721
    4   5  0.894525
    5   6  0.978174
    6   7  0.859449
    7   8  0.541247
    8   9  0.232302
    9  10  0.276566

How can I retrieve the index value of a row, given a condition? For example, dfb = df[df['A']==5].index.values.astype(int) returns [4], but what I would like to get is just 4. This causes trouble later in the code.

Based on some conditions, I want to record the index values where each condition is fulfilled, and then select the rows between them.

I tried

    dfb = df[df['A']==5].index.values.astype(int)
    dfbb = df[df['A']==8].index.values.astype(int)
    df.loc[dfb:dfbb,'B']

for a desired output

       A         B
    4  5  0.894525
    5  6  0.978174
    6  7  0.859449

but I get TypeError: '[4]' is an invalid key

durbachit asked Dec 19 '16 07:12





1 Answer

The easiest fix is to add [0], which selects the first value of the one-element array:

    dfb = df[df['A']==5].index.values.astype(int)[0]
    dfbb = df[df['A']==8].index.values.astype(int)[0]

Or take the first index label directly and convert it to int:

    dfb = int(df[df['A']==5].index[0])
    dfbb = int(df[df['A']==8].index[0])

But if no row matches the condition, an error is raised, because the first value does not exist.
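Both cases can be reproduced with a minimal sketch. The DataFrame is rebuilt from the values shown in the question; the `raised` flag is only an illustrative name, not part of the original answer.

```python
import pandas as pd

# Rebuild the question's DataFrame (values of B copied from the question).
df = pd.DataFrame({
    "A": range(1, 11),
    "B": [0.810743, 0.595866, 0.154888, 0.472721, 0.894525,
          0.978174, 0.859449, 0.541247, 0.232302, 0.276566],
})

# A match exists: the first label of the filtered index is a plain scalar.
dfb = int(df[df["A"] == 5].index[0])
print(dfb)  # 4

# No match: the filtered index is empty, so [0] raises IndexError.
try:
    df[df["A"] == 50].index[0]
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```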

A solution is to use next with iter, which returns a default value if nothing matches:

    dfb = next(iter(df[df['A']==5].index), 'no match')
    print(dfb)
    4

    dfb = next(iter(df[df['A']==50].index), 'no match')
    print(dfb)
    no match

Then it seems you need to subtract 1 from the end label, because slicing with loc includes both endpoints:

    print(df.loc[dfb:dfbb-1,'B'])
    4    0.894525
    5    0.978174
    6    0.859449
    Name: B, dtype: float64
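The lookup and the slice can be combined into one guarded snippet. This is only a sketch: the helper name `first_label` and the choice of `None` as the default are assumptions for illustration, and subtracting 1 from the end label is only meaningful because the question's DataFrame has a default integer index with consecutive labels.

```python
import pandas as pd

# Rebuild the question's DataFrame (values of B copied from the question).
df = pd.DataFrame({
    "A": range(1, 11),
    "B": [0.810743, 0.595866, 0.154888, 0.472721, 0.894525,
          0.978174, 0.859449, 0.541247, 0.232302, 0.276566],
})


def first_label(frame, mask, default=None):
    """Hypothetical helper: first index label where mask is True, else default."""
    return next(iter(frame[mask].index), default)


start = first_label(df, df["A"] == 5)
stop = first_label(df, df["A"] == 8)
missing = first_label(df, df["A"] == 50)  # no row matches, so this is None

if start is None or stop is None:
    # At least one condition matched nothing: fall back to an empty selection.
    result = df["B"].iloc[0:0]
else:
    # loc slices include the end label, hence stop - 1 to exclude it.
    result = df.loc[start:stop - 1, "B"]

print(result)
```

Returning `None` rather than a string such as 'no match' keeps the sentinel easy to test for before slicing.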

Another solution uses boolean indexing or query, which selects the rows by value without looking up index labels first:

    print(df[(df['A'] >= 5) & (df['A'] < 8)])
       A         B
    4  5  0.894525
    5  6  0.978174
    6  7  0.859449

    print(df.loc[(df['A'] >= 5) & (df['A'] < 8), 'B'])
    4    0.894525
    5    0.978174
    6    0.859449
    Name: B, dtype: float64

    print(df.query('A >= 5 and A < 8'))
       A         B
    4  5  0.894525
    5  6  0.978174
    6  7  0.859449
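As a quick check, the sketch below confirms that the mask and query forms select the same rows on the question's data. The `Series.between` variant is an addition that is not part of the original answer; it includes both bounds by default, so the upper bound is written as 7, which is equivalent to `A < 8` only because A holds integers.

```python
import pandas as pd

# Rebuild the question's DataFrame (values of B copied from the question).
df = pd.DataFrame({
    "A": range(1, 11),
    "B": [0.810743, 0.595866, 0.154888, 0.472721, 0.894525,
          0.978174, 0.859449, 0.541247, 0.232302, 0.276566],
})

# Boolean mask, as in the answer.
by_mask = df[(df["A"] >= 5) & (df["A"] < 8)]

# The same condition written as a query string.
by_query = df.query("A >= 5 and A < 8")

# Assumed alternative: between() includes both endpoints by default.
by_between = df[df["A"].between(5, 7)]

print(by_mask.equals(by_query))
print(by_mask.equals(by_between))
```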
jezrael answered Oct 05 '22 21:10
